I’ve worked on a few API operations maintaining our metadata holdings.
One thing I worked on was using the API to synchronise metadata holdings between Aristotle (our master) and Cloudera Navigator in a Data Lake environment (a mirror containing only a subset of Aristotle's holdings).
To do this, I used GraphQL to gather all the information I needed about a group of distributions of interest:
- With a nominated list of distribution UUIDs, query the GraphQL API to get, for each distribution, a complete list of:
- the Data Elements it contains
- the Value Domain of each Data Element
- the Permissible Values of that Value Domain, if applicable
- For each distribution, generate a JSON file compatible with the Cloudera Navigator API.
- FTP the JSON files to a Unix server that has access to Cloudera.
- A Bash script on the Unix server calls the Cloudera Navigator REST API to load each JSON file containing the new Aristotle data.
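The gather-and-convert steps above can be sketched roughly as follows. This is a minimal illustration, not my production code: the GraphQL field names depend on the Aristotle schema version, and the output keys are placeholders rather than Navigator's real metadata schema.

```python
import json

# Illustrative GraphQL query for one distribution's data elements,
# value domains, and permissible values. Field names are assumptions
# about the Aristotle schema, not verified against a live registry.
DISTRIBUTION_QUERY = """
query ($uuid: UUID!) {
  distributions (uuid: $uuid) {
    edges {
      node {
        name
        dataelementpathSet {
          dataElement {
            name
            valueDomain {
              name
              permissiblevalueSet { value meaning }
            }
          }
        }
      }
    }
  }
}
"""

def navigator_payload(distribution: dict) -> dict:
    """Flatten one distribution node (shaped like the query result above)
    into a dict ready to be dumped as a JSON file for loading.
    The output keys here are illustrative, not Navigator's actual schema."""
    fields = []
    for path in distribution["dataelementpathSet"]:
        de = path["dataElement"]
        vd = de.get("valueDomain") or {}
        fields.append({
            "name": de["name"],
            "valueDomain": vd.get("name"),
            "permissibleValues": [
                pv["value"] for pv in vd.get("permissiblevalueSet", [])
            ],
        })
    return {"name": distribution["name"], "fields": fields}

# One JSON file per distribution, e.g.:
# json.dump(navigator_payload(node), open(f"{uuid}.json", "w"), indent=2)
```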
As far as technologies go, to run the GraphQL queries against Aristotle I'm just using VBA.
To load the JSON files into Cloudera, I'm using a Bash script invoked on a Unix machine.
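The loader itself is a Bash script wrapping a REST call, but the shape of that call can be sketched in Python for clarity. The URL path, API version, and basic-auth scheme here are assumptions, not Navigator's confirmed endpoint:

```python
import base64
import json
import urllib.request

# Hypothetical Navigator endpoint; the real host, port, and API version
# will differ per installation.
NAVIGATOR_URL = "https://navigator.example.com:7187/api/v9/metadata"

def build_request(payload: dict, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request carrying
    one distribution's JSON payload. Sending it would just be
    urllib.request.urlopen(req) on a machine with access to Cloudera."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        NAVIGATOR_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

In practice the Bash script does the equivalent with curl, looping over the JSON files FTP'd to the server.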
Are you interested in examples of the GraphQL query I call to gather the distribution/data element/value domain information?
Other things I’ve worked on are aimed at making metadata maintenance within Aristotle a little easier. For example, we had to consolidate two identical value domains that were linked to a large number of data elements. It made no sense to have two separate value domains, and it would have been painful to use the Aristotle UI to click through every link and merge one into the other. So I just wrote a few API calls to discover all the links that needed updating, and make the changes (swapping the old value domain for the new). Similar routines swap data elements within distributions, and so on. But this isn’t about integration with legacy systems, just maintenance.
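The core of that consolidation routine is just the link-swapping step. A minimal sketch, assuming the data elements have already been fetched from the API as dicts (the field names and any client helpers are hypothetical):

```python
def plan_value_domain_swap(data_elements, old_vd_uuid, new_vd_uuid):
    """Return the minimal set of updates needed to repoint every data
    element from the duplicate value domain to the one being kept.
    Each update dict would then be applied with a PATCH-style API call
    (the client for that is not shown here)."""
    return [
        {"uuid": de["uuid"], "valueDomain": new_vd_uuid}
        for de in data_elements
        if de.get("valueDomain") == old_vd_uuid
    ]

# Applying the plan would look something like:
# for update in plan_value_domain_swap(all_des, OLD_VD, NEW_VD):
#     api.patch_data_element(update["uuid"], valueDomain=update["valueDomain"])
```

Separating "discover what needs changing" from "apply the change" also makes it easy to review the plan before touching anything, which matters when hundreds of data elements are linked.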
Hope this is useful. If there’s anything specific you’d like me to go into more detail on, just let me know.