Documenting location of Distributions sourced from Hadoop (HDFS)

I have an API distribution loader working with Aristotle, and I’m extracting table data from Cloudera Atlas (which offers some insight into the metadata of a Hadoop file system). One of the fields I want to copy over is the table’s location, which in Atlas might look something like this:

"location": "hdfs://hdfs-box/path/morepath/rrr_priv/data/country_table"

In Aristotle, I thought a good spot for this might be the Distribution field “Access URL”, but that field seems to accept only a regular web address, not a path that makes sense for a Hadoop file system.
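For what it’s worth, the Atlas value is a well-formed URI; it just uses the hdfs scheme rather than http or https. A quick check with Python’s standard library (using the example location from above, nothing Aristotle-specific) shows how it splits:

```python
from urllib.parse import urlsplit

# Example value taken from the Atlas output quoted above.
location = "hdfs://hdfs-box/path/morepath/rrr_priv/data/country_table"

# urlsplit handles any scheme followed by "//", not only web schemes.
parts = urlsplit(location)

print(parts.scheme)  # the URI scheme
print(parts.netloc)  # the name node / host part
print(parts.path)    # the path within the file system
```

So the only thing separating it from a value the field would accept appears to be the scheme.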

Here’s the error if you try a regular edit screen:

[Screenshot: Access URL Aristotle error]

Is there another field this location data would be better placed in?
Or should “Access URL” be updated to allow for locations unique to Hadoop?

Hi Michael,

I’ve looked into this. Our URL fields were using an overly strict validator (basically rejecting anything that wasn’t http/https). We’ve fixed this now, and the fix will be in next Tuesday’s release (14/09/2021).
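To illustrate the kind of check involved, here is a minimal Python sketch of a scheme-based URL validator. It is not Aristotle’s actual code: the function name `is_allowed_url` and both scheme sets are assumptions made up for this example, and the real fix may allow a different set of schemes.

```python
from urllib.parse import urlsplit

# Hypothetical scheme lists, for illustration only.
STRICT_SCHEMES = {"http", "https"}            # web addresses only
RELAXED_SCHEMES = STRICT_SCHEMES | {"hdfs"}   # also accepts HDFS locations


def is_allowed_url(value: str, allowed_schemes=RELAXED_SCHEMES) -> bool:
    """Return True if value has an allowed scheme and a host part."""
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs.
        return False
    return parts.scheme in allowed_schemes and bool(parts.netloc)


location = "hdfs://hdfs-box/path/morepath/rrr_priv/data/country_table"
print(is_allowed_url(location, STRICT_SCHEMES))  # strict check
print(is_allowed_url(location))                  # relaxed check
```

With the strict set the HDFS location is rejected; with the relaxed set it passes, while a bare string such as `country_table` is still refused because it has no scheme or host.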

Thanks as always,
Dylan


Hi Michael,

Just letting you know that we released this fix last Tuesday. I hope that lets you continue with your API distribution loader. As always, let us know if you have any other queries. Aristotle is always happy to help.
