Interact with the Query server (e.g. submit a SQL query, upload files to cloud storage, search for a table…) via a REST API.
Users authenticate with the same credentials as they would in the browser login page.
The API can be called directly via REST.
First authenticate with your account credentials and get a token. Then provide the token as a header in all subsequent requests, e.g.:
curl -X POST https://demo.gethue.com/api/v1/editor/execute/hive --data 'statement=SHOW TABLES' -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM5NjMxLCJqdGkiOiI0NTY3NTA4MzM5YjY0MjFmYTMzZDJjMzViZWUyMDAyMCIsInVzZXJfaWQiOjF9.qrMNrr69eo38dOsV2aYp8k6WqBeyJZkbSuavxA_o_kM"
The default content type is form data, e.g.:
-H "Content-Type: application/x-www-form-urlencoded" -d 'username=demo&password=demo'
It is also possible to submit data in JSON format for the calls that read the data via request.body:
-H "Content-Type: application/json" -d '{"username": "demo", "password": "demo"}'
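With Python's requests library, these two content types map to the data= and json= keyword arguments. A minimal sketch (the function names are ours, not part of the API):

```python
import requests

URL = "https://demo.gethue.com/api/v1/token/auth/"

def auth_form_encoded(username, password):
    # data= sends application/x-www-form-urlencoded (the API default)
    return requests.post(URL, data={"username": username, "password": password})

def auth_json(username, password):
    # json= sends application/json, for the calls reading request.body
    return requests.post(URL, json={"username": username, "password": password})
```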
Calling without credentials:
curl -X POST https://demo.gethue.com/api/v1/query/create_notebook -H "Content-Type: application/json"
{"detail":"Authentication credentials were not provided."}
Authenticating and getting a JWT token:
curl -X POST https://demo.gethue.com/api/v1/token/auth/ -H "Content-Type: application/json" -d '{"username": "demo", "password": "demo"}'
{"refresh":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoicmVmcmVzaCIsImV4cCI6MTYyMTcyNDYzMSwianRpIjoiOGM0NDRjYzRhN2VhNGMxZDliMGZhNmU1YzUyMjM1MjkiLCJ1c2VyX2lkIjoxfQ.t6t7_eYrNhpGN3-Jz5MDLXM8JtGP7V9Y9lacOTInqqQ","access":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM4NTMxLCJqdGkiOiJhZjgwN2E0ZjBmZDI0ZWMxYWQ2NTUzZjEyMjIyYzU4YyIsInVzZXJfaWQiOjF9.dQ1P3hbzSytp9-o8bWlcOcwrdwRVy95M2Eolph92QMA"}
Re-using the token when making actual calls:
curl -X POST https://demo.gethue.com/api/v1/query/create_notebook -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM5NjMxLCJqdGkiOiI0NTY3NTA4MzM5YjY0MjFmYTMzZDJjMzViZWUyMDAyMCIsInVzZXJfaWQiOjF9.qrMNrr69eo38dOsV2aYp8k6WqBeyJZkbSuavxA_o_kM"
{"status": 0, "notebook": {"name": "My Notebook", "uuid": "1e23314f-b01e-4c18-872f-dc143475f063", "description": "", "type": "notebook", "isSaved": false, "isManaged": false, "skipHistorify": false, "sessions": [], "snippets": [], "directoryUuid": null}}
In this code snippet, we will use the requests library:
pip install requests
And then:
import json
import requests
session = requests.Session()
data = {
    'username': 'demo',
    'password': 'demo',
}

response = session.post("https://demo.gethue.com/api/v1/token/auth", data=data)
print('Auth: %s %s' % ('success' if response.status_code == 200 else 'error', response.status_code))

token = json.loads(response.content)['access']
print('Token: %s' % token)

response = requests.post(
    'https://demo.gethue.com/api/v1/query/autocomplete',
    headers={
        'Authorization': 'Bearer %s' % token,
        "Content-Type": "application/x-www-form-urlencoded"
    },
    data={'snippet': json.dumps({"type": "1"})}
)
print(response.status_code)
print(response.text)
And the same in JavaScript with Axios:
<script src="https://unpkg.com/[email protected]/dist/axios.min.js"></script>
<script type="text/javascript">
  const API_URL = "https://demo.gethue.com";
  axios.defaults.baseURL = API_URL;

  axios.post('api/v1/token/auth/', {username: "hue", password: "hue"}).then(function(response) {
    console.log(response.data);
    // Util to check if a cached token is still valid before asking to auth for a new one
    axios.post('api/v1/token/verify/', {token: response.data.access});
    axios.defaults.headers.common['Authorization'] = 'Bearer ' + response.data.access;
  }).then(function() {
    axios.post('api/v1/query/sqlite', {statement: "SELECT 1000, 1001"}).then(function(response) {
      console.log(response.data);
    });
    axios.post('api/v1/connectors/types/').then(function(response) {
      console.log(response.data);
    });
  });
</script>
The API authenticates via the authentication backends of the server (same as going via the login page). This is consistent and users are free to interact via their browsers or API.
Then a JWT token is returned and needs to be passed as a bearer in the headers for all the API calls.
Wrong credentials: on bad authentication, a 401 Unauthorized response is returned, e.g.:
curl -X POST https://demo.gethue.com/api/v1/editor/create_notebook -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM5NjMxLCJqdGkiOiI0NTY3NTA4MzM5YjY0MjFmYTMzZDJjMzViZWUyMDAyMCIsInVzZXJfaWQiOjF9.qrMNrr69eo38dOsV2aYp8k6WqBeyJZkbSuavxA_o_kM"
{"detail":"Given token not valid for any token type","code":"token_not_valid","messages":[{"token_class":"AccessToken","token_type":"access","message":"Token is invalid or expired"}]}
[09/Jul/2021 23:58:40 -0700] access INFO demo.gethue.com -anon- - "POST /api/v1/editor/create_notebook HTTP/1.1" returned in 2ms 401 183 (mem: 124mb)
Provide login credentials and get a JWT token:
curl -X POST https://demo.gethue.com/api/v1/token/auth -d 'username=demo&password=demo'
{"refresh":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoicmVmcmVzaCIsImV4cCI6MTYyMTcyNDYzMSwianRpIjoiOGM0NDRjYzRhN2VhNGMxZDliMGZhNmU1YzUyMjM1MjkiLCJ1c2VyX2lkIjoxfQ.t6t7_eYrNhpGN3-Jz5MDLXM8JtGP7V9Y9lacOTInqqQ","access":"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM4NTMxLCJqdGkiOiJhZjgwN2E0ZjBmZDI0ZWMxYWQ2NTUzZjEyMjIyYzU4YyIsInVzZXJfaWQiOjF9.dQ1P3hbzSytp9-o8bWlcOcwrdwRVy95M2Eolph92QMA"}
Keep the access token as the value of the bearer header in the API calls.
The validity (i.e. did it expire?) of an access token can be verified:
curl -X POST https://demo.gethue.com/api/v1/token/verify/ -d 'token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoiYWNjZXNzIiwiZXhwIjoxNjIxNjM4NTMxLCJqdGkiOiJhZjgwN2E0ZjBmZDI0ZWMxYWQ2NTUzZjEyMjIyYzU4YyIsInVzZXJfaWQiOjF9.dQ1P3hbzSytp9-o8bWlcOcwrdwRVy95M2Eolph92QMA'
Similarly, an access token's validity can be extended via a refresh, by sending the refresh token obtained in the initial authentication:
curl -X POST https://demo.gethue.com/api/v1/token/refresh/ -d 'refresh=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ0b2tlbl90eXBlIjoicmVmcmVzaCIsImV4cCI6MTYyMTcyNDYzMSwianRpIjoiOGM0NDRjYzRhN2VhNGMxZDliMGZhNmU1YzUyMjM1MjkiLCJ1c2VyX2lkIjoxfQ.t6t7_eYrNhpGN3-Jz5MDLXM8JtGP7V9Y9lacOTInqqQ'
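The auth, verify, and refresh calls can be combined into a small Python helper. This is a sketch using the requests library; the function names are ours, the endpoints are the ones shown above:

```python
import requests

BASE_URL = "https://demo.gethue.com/api/v1"

def bearer_headers(access_token):
    # All API calls pass the access token as a bearer header
    return {"Authorization": "Bearer %s" % access_token}

def is_token_valid(access_token):
    # /token/verify/ answers 200 while the access token has not expired
    response = requests.post("%s/token/verify/" % BASE_URL, data={"token": access_token})
    return response.status_code == 200

def refresh_access_token(refresh_token):
    # /token/refresh/ trades the refresh token for a new access token
    response = requests.post("%s/token/refresh/" % BASE_URL, data={"refresh": refresh_token})
    response.raise_for_status()
    return response.json()["access"]

if __name__ == "__main__":
    tokens = requests.post(
        "%s/token/auth/" % BASE_URL, data={"username": "demo", "password": "demo"}
    ).json()
    access = tokens["access"]
    if not is_token_valid(access):
        access = refresh_access_token(tokens["refresh"])
    print(bearer_headers(access))
```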
Users can authenticate with their own JWT with the help of a custom backend (supporting RS256). To enable it, add the following to the hue.ini:
[desktop]
[[auth]]
[[[jwt]]]
key_server_url=https://ext_authz:8000
issuer=<your_external_app>
audience=hue
username_header=sub
Also, to allow Hue to send this JWT to external services like Impala, enable the following flag in hue.ini:
[desktop]
use_thrift_http_jwt=true
If you wish to implement your own custom auth (e.g. a customized connection with an external auth server, or a different signing algorithm), you can follow the Django REST Framework custom authentication pluggability and add it like the dummy auth example.
And then add it in hue.ini (comma-separated and in order of priority if multiple auth backends are present):
[desktop]
[[auth]]
api_auth=<your_own_custom_auth_backend>
Now that we are authenticated, here is how to execute a SHOW TABLES SQL query via the hive connector. You could repeat the steps with any query you want, e.g. SELECT * FROM web_logs LIMIT 100.
Select the dialect via the last part of the URL: /api/v1/editor/execute/<dialect>.
Optional parameters can be submitted along with the statement.
For a SHOW TABLES, we first send the query statement:
curl -X POST https://demo.gethue.com/api/v1/editor/execute/hive --data 'statement=SHOW TABLES'
{"status": 0, "history_id": 17880, "handle": {"statement_id": 0, "session_type": "hive", "has_more_statements": false, "guid": "EUI32vrfTkSOBXET6Eaa+A==\n", "previous_statement_hash": "3070952e55d733fb5bef249277fb8674989e40b6f86c5cc8b39cc415", "log_context": null, "statements_count": 1, "end": {"column": 10, "row": 0}, "session_id": 63, "start": {"column": 0, "row": 0}, "secret": "RuiF0LEkRn+Yok/gjXWSqg==\n", "has_result_set": true, "session_guid": "c845bb7688dca140:859a5024fb284ba2", "statement": "SHOW TABLES", "operation_type": 0, "modified_row_count": null}, "history_uuid": "63ce87ba-ca0f-4653-8aeb-e9f5c1781b78"}
Then check the status of the operation (passing as operationId the history_uuid from the execute response) until its result is ready to fetch:
curl -X POST https://demo.gethue.com/api/v1/editor/check_status --data 'operationId=63ce87ba-ca0f-4653-8aeb-e9f5c1781b78'
{"status": 0, "query_status": {"status": "available", "has_result_set": true}}
And now ask for the result set of the statement:
curl -X POST https://demo.gethue.com/api/v1/editor/fetch_result_data --data 'operationId=63ce87ba-ca0f-4653-8aeb-e9f5c1781b78'
{"status": 0, "result": {"has_more": true, "type": "table", "meta": [{"comment": "from deserializer", "type": "STRING_TYPE", "name": "tab_name"}], "data": [["adavi"], ["adavi1"], ["adavi2"], ["ambs_feed"], ["apx_adv_deduction_data_process_total"], ["avro_table"], ["avro_table1"], ["bb"], ["bharath_info1"], ["bucknew"], ["bucknew1"], ["chungu"], ["cricket3"], ["cricket4"], ["cricket5_view"], ["cricketer"], ["cricketer_view"], ["cricketer_view1"], ["demo1"], ["demo12345"], ["dummy"], ["embedded"], ["emp"], ["emp1_sept9"], ["emp_details"], ["emp_sept"], ["emp_tbl1"], ["emp_tbl2"], ["empdtls"], ["empdtls_ext"], ["empdtls_ext_v2"], ["employee"], ["employee1"], ["employee_ins"], ["empppp"], ["events"], ["final"], ["flight_data"], ["gopalbhar"], ["guruhive_internaltable"], ["hell"], ["info1"], ["lost_messages"], ["mnewmyak"], ["mortality"], ["mscda"], ["myak"], ["mysample"], ["mysample1"], ["mysample2"], ["network"], ["ods_t_exch_recv_rel_wfz_stat_szy"], ["olympicdata"], ["p_table"], ["partition_cricket"], ["partitioned_user"], ["s"], ["sample"], ["sample_07"], ["sample_08"], ["score"], ["stg_t_exch_recv_rel_wfz_stat_szy"], ["stocks"], ["students"], ["studentscores"], ["studentscores2"], ["t1"], ["table_name"], ["tablex"], ["tabley"], ["temp"], ["test1"], ["test2"], ["test21"], ["test_info"], ["topage"], ["txnrecords"], ["u_data"], ["udata"], ["user_session"], ["user_test"], ["v_empdtls"], ["v_empdtls_ext"], ["v_empdtls_ext_v2"], ["web_logs"]], "isEscaped": true}}
And if we wanted to get the execution log for this statement:
curl -X POST https://demo.gethue.com/api/v1/editor/get_logs --data 'operationId=63ce87ba-ca0f-4653-8aeb-e9f5c1781b78'
{"status": 0, "progress": 5, "jobs": [], "logs": "", "isFullLogs": false}
Same but in Python:
params = {
    'statement': 'SELECT 1, 2, 3',
}

response = requests.post(
    'https://demo.gethue.com/api/v1/editor/execute/mysql',
    headers={
        'Authorization': 'Bearer %s' % token,
        "Content-Type": "application/x-www-form-urlencoded"
    },
    data=params
)
print(response.status_code)
print(response.text)

resp_content = json.loads(response.text)
data = {
    'operationId': resp_content['history_uuid'],
}

response = requests.post(
    'https://demo.gethue.com/api/v1/editor/check_status',
    headers={
        'Authorization': 'Bearer %s' % token,
        "Content-Type": "application/x-www-form-urlencoded"
    },
    data=data
)
print(response.status_code)
print(response.text)

response = requests.post(
    'https://demo.gethue.com/api/v1/editor/fetch_result_data',
    headers={
        'Authorization': 'Bearer %s' % token,
        "Content-Type": "application/x-www-form-urlencoded"
    },
    data=data
)
print(response.status_code)
print(response.text)
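In practice, the single check_status call above would be polled until the result is available. A sketch of such a loop (the helper names are ours; the "available", "failed", and "expired" statuses are the ones appearing in the server responses in this document):

```python
import time
import requests

BASE_URL = "https://demo.gethue.com/api/v1"

def is_done(status):
    # "available" means the result can be fetched; "failed" and "expired"
    # are terminal states seen in the query history statuses.
    return status in ("available", "failed", "expired")

def wait_until_available(token, operation_id, timeout=60, interval=1):
    # Poll /editor/check_status until the query reaches a terminal state
    headers = {"Authorization": "Bearer %s" % token}
    deadline = time.time() + timeout
    while time.time() < deadline:
        response = requests.post(
            "%s/editor/check_status" % BASE_URL,
            headers=headers,
            data={"operationId": operation_id},
        )
        status = response.json()["query_status"]["status"]
        if is_done(status):
            return status
        time.sleep(interval)
    raise TimeoutError("Query still running after %s seconds" % timeout)
```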
Listing databases:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/ -d 'snippet={"type":"hive"}'
{"status": 0, "databases": ["default", "information_schema", "sys"]}
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Listing the tables of a database:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/<DB>/ -d 'snippet={"type":"hive"}'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Describe database API:
curl -X POST https://demo.gethue.com/api/v1/editor/describe/<DB>/ -d 'source_type=mysql'
Parameters:
- source_type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Listing the columns of a table:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/<DB>/<TABLE>/ -d 'snippet={"type":"hive"}'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Describe table API:
curl -X POST https://demo.gethue.com/api/v1/editor/describe/<DB>/<TABLE>/ -d 'source_type=1'
Parameters:
- source_type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Analyze API:
curl -X POST https://demo.gethue.com/api/v1/<DIALECT>/analyze/<DB>/<TABLE>/
Sample table data API:
curl -X POST https://demo.gethue.com/api/v1/editor/sample/<DB>/<TABLE>/ -d 'snippet={"type":"hive"}'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Column details:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/<DB>/<TABLE>/<COL1>/ -d 'snippet={"type":"hive"}'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Analyze API:
curl -X POST https://demo.gethue.com/api/v1/<DIALECT>/analyze/<DB>/<TABLE>/<COL1>/
Sample column data API:
curl -X POST https://demo.gethue.com/api/v1/editor/sample/<DB>/<TABLE>/<COL1>/ -d 'snippet={"type":"hive"}'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)

Default functions:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete -d 'snippet={"type":"hive"}' -d 'operation=functions'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)
- operation: functions

For a specific database:
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/<DB> -d 'snippet={"type":"impala"}' -d 'operation=functions'
Parameters:
- type: one of the configured dialects (e.g. impala) or connector IDs (e.g. 1)
- operation: functions

For a specific function/UDF details (e.g. trunc):
curl -X POST https://demo.gethue.com/api/v1/editor/autocomplete/<function_name> -d 'snippet={"type":"hive"}' -d 'operation=function'
Parameters:
- type: one of the configured dialects (e.g. hive) or connector IDs (e.g. 1)
- operation: function

Query history: we can choose a dialect for doc_type, e.g. impala, mysql, hive, phoenix, etc.:
curl -X GET https://demo.gethue.com/api/v1/editor/get_history?doc_type=hive
{"status": 0, "count": 3, "history": [{"name": "", "id": 2008, "uuid": "5b48c678-1224-4863-b523-3baab82402a7", "type": "query-hive", "data": {"statement": "CREATE TABLE w12( Name STRING, Money BIGINT )", "lastExecuted": 1621502970360, "status": "failed", "parentSavedQueryUuid": ""}, "absoluteUrl": "/editor?editor=2008"}, {"name": "", "id": 2006, "uuid": "1cd32ae0-9b61-46ae-8fd4-72c4255209c3", "type": "query-hive", "data": {"statement": "CREATE TABLE q13( Name STRING, Money BIGINT )", "lastExecuted": 1621498889058, "status": "expired", "parentSavedQueryUuid": ""}, "absoluteUrl": "/editor?editor=2006"}, {"name": "", "id": 2003, "uuid": "e5ec1fa4-1a36-4e42-a814-a685b0142223", "type": "query-hive", "data": {"statement": "CREATE TABLE q11( Name STRING, Money BIGINT );\nINSERT INTO q11 VALUES ('abc', 100);", "lastExecuted": 1621498771619, "status": "expired", "parentSavedQueryUuid": ""}, "absoluteUrl": "/editor?editor=2003"}], "message": "History fetched"}
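The same history call in Python (a sketch; get_query_history and statements are our own helper names, the endpoint and doc_type parameter are the ones shown above):

```python
import requests

def get_query_history(token, doc_type="hive"):
    # GET /editor/get_history?doc_type=<dialect> lists past queries for that dialect
    response = requests.get(
        "https://demo.gethue.com/api/v1/editor/get_history",
        headers={"Authorization": "Bearer %s" % token},
        params={"doc_type": doc_type},
    )
    response.raise_for_status()
    return response.json()["history"]

def statements(history):
    # Pull just the SQL text out of each history entry
    return [entry["data"]["statement"] for entry in history]
```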
Getting the configuration of the server (apps, interpreters, browsers…):
curl -X POST https://demo.gethue.com/api/v1/get_config/
{"app_config": {"editor": {"name": "editor", "displayName": "Editor", "buttonName": "Query", "interpreters": [{"name": "MySQL", "type": "mysql", "id": "mysql", "displayName": "MySQL", "buttonName": "Query", "tooltip": "Mysql Query", "optimizer": "off", "page": "/editor/?type=mysql", "is_sql": true, "is_batchable": true, "dialect": "mysql", "dialect_properties": {}}, {"name": "notebook", "type": "notebook", "displayName": "Notebook", "buttonName": "Notebook", "tooltip": "Notebook", "page": "/notebook", "is_sql": false, "dialect": "notebook"}], "default_limit": 5000, "interpreter_names": ["mysql", "notebook"], "page": "/editor/?type=mysql", "default_sql_interpreter": "mysql"}, "catalogs": [{"name": "MySQL", "type": "mysql", "id": "mysql", "displayName": "MySQL", "buttonName": "Query", "tooltip": "Mysql Query", "page": "/editor/?type=mysql", "is_sql": true, "is_catalog": true}], "browser": {"name": "browser", "displayName": "Browsers", "buttonName": "Browse", "interpreters": [{"type": "hdfs", "displayName": "Files", "buttonName": "Browse", "tooltip": "Files", "page": "/filebrowser/view=%2Fuser%2Fdemo"}, {"type": "tables", "displayName": "Tables", "buttonName": "Browse", "tooltip": "Tables", "page": "/metastore/tables"}, {"type": "yarn", "displayName": "Jobs", "buttonName": "Jobs", "tooltip": "Jobs", "page": "/jobbrowser/"}, {"type": "importer", "displayName": "Importer", "buttonName": "Import", "tooltip": "Importer", "page": "/indexer/importer"}], "interpreter_names": ["hdfs", "tables", "yarn", "importer"]}, "home": {"name": "home", "displayName": "Home", "buttonName": "Documents", "interpreters": [], "page": "/home"}}, "main_button_action": {"name": "MySQL", "type": "mysql", "id": "mysql", "displayName": "MySQL", "buttonName": "Query", "tooltip": "Mysql Query", "optimizer": "off", "page": "/editor/?type=mysql", "is_sql": true, "is_batchable": true, "dialect": "mysql", "dialect_properties": {}}, "button_actions": [{"name": "editor", "displayName": "Editor", 
"buttonName": "Query", "interpreters": [{"name": "MySQL", "type": "mysql", "id": "mysql", "displayName": "MySQL", "buttonName": "Query", "tooltip": "Mysql Query", "optimizer": "off", "page": "/editor/?type=mysql", "is_sql": true, "is_batchable": true, "dialect": "mysql", "dialect_properties": {}}, {"name": "notebook", "type": "notebook", "displayName": "Notebook", "buttonName": "Notebook", "tooltip": "Notebook", "page": "/notebook", "is_sql": false, "dialect": "notebook"}], "default_limit": 5000, "interpreter_names": ["mysql", "notebook"], "page": "/editor/?type=mysql", "default_sql_interpreter": "mysql"}], "default_sql_interpreter": "mysql", "cluster_type": "direct", "has_computes": false, "hue_config": {"enable_sharing": true, "is_admin": true}, "clusters": [{"id": "default", "name": "default", "type": "direct", "credentials": {}}], "documents": {"types": ["directory", "gist", "query-mysql"]}, "status": 0}
Hue's File Browser offers uploads, downloads, operations (create, delete, chmod…) and listing of data in HDFS (hdfs:// or no prefix), S3 (s3a:// prefix), ADLS (adls:// or abfs:// prefixes) and Ozone (ofs:// prefix) storages.
Get the filesystem details: the filesystems configured in Hue that the user has access to, and their home directories:
curl -X GET https://demo.gethue.com/api/v1/storage/filesystems
[{"file_system": "hdfs", "user_home_directory": "/user/demo"}, {"file_system": "s3a", "user_home_directory": "s3a://<some_s3_path>"}, {"file_system": "abfs", "user_home_directory": "abfs://<some_abfs_path>"}, {"file_system": "ofs", "user_home_directory": "ofs://<some_ofs_path>"}]
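A Python sketch turning that response into a lookup of home directory per filesystem (the helper names are ours):

```python
import requests

def to_home_map(filesystems):
    # e.g. [{"file_system": "hdfs", "user_home_directory": "/user/demo"}]
    #      -> {"hdfs": "/user/demo"}
    return {fs["file_system"]: fs["user_home_directory"] for fs in filesystems}

def get_home_directories(token):
    # GET /storage/filesystems lists the filesystems the user has access to
    response = requests.get(
        "https://demo.gethue.com/api/v1/storage/filesystems",
        headers={"Authorization": "Bearer %s" % token},
    )
    response.raise_for_status()
    return to_home_map(response.json())
```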
Here is how to list the content of a path, here an S3 bucket s3a://demo-gethue:
curl -X GET https://demo.gethue.com/api/v1/storage/view=s3a://demo-gethue
{
...........
"files": [
{
"humansize": "0\u00a0bytes",
"url": "/filebrowser/view=s3a%3A%2F%2Fdemo-hue",
"stats": {
"size": 0,
"aclBit": false,
"group": "",
"user": "",
"mtime": null,
"path": "s3a://demo-gethue",
"atime": null,
"mode": 16895
},
"name": "demo-hue",
"mtime": "",
"rwx": "drwxrwxrwx",
"path": "s3a://demo-gethue",
"is_sentry_managed": false,
"type": "dir",
"mode": "40777"
},
{
"humansize": "0\u00a0bytes",
"url": "/filebrowser/view=S3A%3A%2F%2F",
"stats": {
"size": 0,
"aclBit": false,
"group": "",
"user": "",
"mtime": null,
"path": "S3A://",
"atime": null,
"mode": 16895
},
"name": ".",
"mtime": "",
"rwx": "drwxrwxrwx",
"path": "S3A://",
"is_sentry_managed": false,
"type": "dir",
"mode": "40777"
}
],
...........
}
Some of the parameters:
E.g. ?pagesize=45&pagenum=1&filter=&sortby=name&descending=false
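The listing call with those pagination and sorting parameters, in Python (a sketch; the helper names are ours, the parameter names are the ones shown just above):

```python
import requests

def view_url(path):
    # The path is appended directly after "view=" as in the curl example above
    return "https://demo.gethue.com/api/v1/storage/view=%s" % path

def list_path(token, path, pagesize=45, pagenum=1, sortby="name", descending=False):
    # GET /storage/view=<path>?pagesize=...&pagenum=...&filter=&sortby=...&descending=...
    response = requests.get(
        view_url(path),
        headers={"Authorization": "Bearer %s" % token},
        params={
            "pagesize": pagesize,
            "pagenum": pagenum,
            "filter": "",
            "sortby": sortby,
            "descending": str(descending).lower(),
        },
    )
    response.raise_for_status()
    return [f["path"] for f in response.json()["files"]]
```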
How to get some of the file content and its stats/metadata. Example with an S3 file:
curl -X GET https://demo.gethue.com/api/v1/storage/view=s3a://demo-gethue/data/web_logs/index_data.csv
{
"show_download_button": true,
"is_embeddable": false,
"editable": false,
"mtime": "October 31, 2016 03:34 PM",
"rwx": "-rw-rw-rw-",
"path": "s3a://demo-gethue/data/web_logs/index_data.csv",
"stats": {
"size": 6199593,
"aclBit": false,
...............
"contents": "code,protocol,request,app,user_agent_major,region_code,country_code,id,city,subapp,latitude,method,client_ip, user_agent_family,bytes,referer,country_name,extension,url,os_major,longitude,device_family,record,user_agent,time,os_family,country_code3
200,HTTP/1.1,GET /metastore/table/default/sample_07 HTTP/1.1,metastore,,00,SG,8836e6ce-9a21-449f-a372-9e57641389b3,Singapore,table,1.2931000000000097,GET,128.199.234.236,Other,1041,-,Singapore,,/metastore/table/default/sample_07,,103.85579999999999,Other,"demo.gethue.com:80 128.199.234.236 - - [04/May/2014:06:35:49 +0000] ""GET /metastore/table/default/sample_07 HTTP/1.1"" 200 1041 ""-"" ""Mozilla/5.0 (compatible; phpservermon/3.0.1; +http://www.phpservermonitor.org)""
",Mozilla/5.0 (compatible; phpservermon/3.0.1; +http://www.phpservermonitor.org),2014-05-04T06:35:49Z,Other,SGP
200,HTTP/1.1,GET /metastore/table/default/sample_07 HTTP/1.1,metastore,,00,SG,6ddf6e38-7b83-423c-8873-39842dca2dbb,Singapore,table,1.2931000000000097,GET,128.199.234.236,Other,1041,-,Singapore,,/metastore/table/default/sample_07,,103.85579999999999,Other,"demo.gethue.com:80 128.199.234.236 - - [04/May/2014:06:35:50 +0000] ""GET /metastore/table/default/sample_07 HTTP/1.1"" 200 1041 ""-"" ""Mozilla/5.0 (compatible; phpservermon/3.0.1; +http://www.phpservermonitor.org)""
",Mozilla/5.0 (compatible; phpservermon/3.0.1; +http://www.phpservermonitor.org),2014-05-04T06:35:50Z,Other,SGP
...............
}
Some of the parameters:
E.g. ?offset=0&length=204800&compression=none&mode=text
Specify a path of the file to download:
curl -X GET https://demo.gethue.com/api/v1/storage/download=/user/hue/weblogs.csv
curl -X GET https://demo.gethue.com/api/v1/storage/download=s3a://demo-gethue/data/web_logs/index_data.csv
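The download in Python, streaming the response to disk so large files are not held in memory (a sketch; the helper names are ours):

```python
import requests

def download_url(remote_path):
    # The remote path is appended directly after "download=" as in the curl examples
    return "https://demo.gethue.com/api/v1/storage/download=%s" % remote_path

def download_file(token, remote_path, local_path):
    # Stream the file content to disk in 64 KB chunks
    with requests.get(
        download_url(remote_path),
        headers={"Authorization": "Bearer %s" % token},
        stream=True,
    ) as response:
        response.raise_for_status()
        with open(local_path, "wb") as out:
            for chunk in response.iter_content(chunk_size=64 * 1024):
                out.write(chunk)
    return local_path
```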
Upload a local file to a remote destination directory:
curl -X POST https://demo.gethue.com/api/v1/storage/upload/file?dest=s3a://demo-gethue/web_log_data/ --form hdfs_file=@<local_file>
Note: the multipart form field holding the local file to upload is named hdfs_file, but despite its name this field is not related to HDFS.
Create a directory at a specific path:
curl -X POST https://demo.gethue.com/api/v1/storage/mkdir
Create a file at a specific path:
curl -X POST https://demo.gethue.com/api/v1/storage/touch
Rename a file or directory:
curl -X POST https://demo.gethue.com/api/v1/storage/rename
Move a file or directory to a destination path:
curl -X POST https://demo.gethue.com/api/v1/storage/move
Copy a file or directory to a destination path:
curl -X POST https://demo.gethue.com/api/v1/storage/copy
Note: On the Apache Ozone filesystem, the copy operation returns a string of skipped files if their size is greater than the configured chunk size.
Fetch the content summary for a specific file on HDFS or Apache Ozone:
curl -X GET https://demo.gethue.com/api/v1/storage/content_summary=/user/hue/weblogs.csv
curl -X GET https://demo.gethue.com/api/v1/storage/content_summary=ofs://ozone1/testvolume/testbucket/testfile.csv
Delete a file or directory:
curl -X POST https://demo.gethue.com/api/v1/storage/rmtree
Parameters:
- skip_trash: set to False to move the file to the trash instead of deleting it permanently. Note: currently, the skip_trash field is only supported on HDFS.
Set the replication factor for a file on HDFS:
curl -X POST https://demo.gethue.com/api/v1/storage/set_replication
Restore a specific file or directory from trash on HDFS:
curl -X POST https://demo.gethue.com/api/v1/storage/trash/restore
Purge the trash directory on HDFS:
curl -X POST https://demo.gethue.com/api/v1/storage/trash/purge
The File Import API provides endpoints for uploading, analyzing, and previewing files that can be imported into various SQL engines. This API simplifies the process of creating database tables from files like CSV, TSV, and Excel spreadsheets.
The File Import API allows you to upload a file, inspect its structure, and preview how it would be imported.
A typical workflow for importing a file into a database table involves these steps:
1. Upload the file with the /api/importer/upload/file/ endpoint
2. Guess its metadata (type, delimiters…) with the /api/importer/file/guess_metadata/ endpoint
3. Detect the header row with the /api/importer/file/guess_header/ endpoint
4. Preview the content and column types with the /api/importer/file/preview/ endpoint

Upload a file from your local system to the Hue server.
Endpoint: /api/importer/upload/file/
Method: POST
Content Type: multipart/form-data
Request Parameters:
Name | Type | Required | Description |
---|---|---|---|
file | File | Yes | The file to upload (csv, tsv, excel) |
Example using cURL:
curl -X POST \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
-F "file=@/path/to/sales_data.csv" \
https://demo.gethue.com/api/importer/upload/file/
Response:
{
"file_path": "/tmp/username_abc123_sales_data.csv"
}
Status Codes:
- 201 Created - File was uploaded successfully
- 400 Bad Request - Invalid file format or size
- 500 Internal Server Error - Server-side error

Restrictions:
- The maximum file size is governed by the IMPORTER.MAX_LOCAL_FILE_SIZE_UPLOAD_LIMIT setting
- Restricted file extensions are controlled by the IMPORTER.RESTRICT_LOCAL_FILE_EXTENSIONS setting
Analyze a file to determine its type and metadata properties such as delimiters for CSV files or sheet names for Excel files.
Endpoint: /api/importer/file/guess_metadata/
Method: GET
Request Parameters:
Name | Type | Required | Description |
---|---|---|---|
file_path | String | Yes | Full path to the file to analyze |
import_type | String | Yes | Type of import, either local or remote |
Example using cURL:
curl -X GET \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
"https://demo.gethue.com/api/importer/file/guess_metadata/?file_path=/tmp/username_abc123_sales_data.csv&import_type=local"
Response Examples:
For CSV files:
{
"type": "csv",
"field_separator": ",",
"quote_char": "\"",
"record_separator": "\n"
}
For Excel files:
{
"type": "excel",
"sheet_names": ["Sales 2024", "Sales 2025", "Analytics"]
}
Analyze a file to determine if it has a header row.
Endpoint: /api/importer/file/guess_header/
Method: GET
Request Parameters:
Name | Type | Required | Description |
---|---|---|---|
file_path | String | Yes | Full path to the file to analyze |
file_type | String | Yes | Type of file (csv, tsv, excel, delimiter_format) |
import_type | String | Yes | Type of import, either local or remote |
sheet_name | String | No | Sheet name (required for Excel files) |
Example using cURL:
curl -X GET \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
"https://demo.gethue.com/api/importer/file/guess_header/?file_path=/tmp/username_abc123_sales_data.csv&file_type=csv&import_type=local"
Response:
{
"has_header": true
}
Generate a preview of a file's content with column type mapping for creating SQL tables.
Endpoint: /api/importer/file/preview/
Method: GET
Request Parameters:
Name | Type | Required | Description |
---|---|---|---|
file_path | String | Yes | Full path to the file to preview |
file_type | String | Yes | Type of file (csv, tsv, excel, delimiter_format) |
import_type | String | Yes | Type of import (local or remote) |
sql_dialect | String | Yes | SQL dialect for type mapping (hive, impala, trino, phoenix, sparksql) |
has_header | Boolean | Yes | Whether the file has a header row |
sheet_name | String | No | Sheet name (required for Excel files) |
field_separator | String | No | Field separator character (defaults to , for CSV, \t for TSV; required for delimiter_format) |
quote_char | String | No | Quote character (defaults to ") |
record_separator | String | No | Record separator character (defaults to \n) |
Example using cURL:
curl -X GET \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
"https://demo.gethue.com/api/importer/file/preview/?file_path=/tmp/username_abc123_sales_data.csv&file_type=csv&import_type=local&sql_dialect=hive&has_header=true"
# For a custom pipe-delimited file using delimiter_format
curl -X GET \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
"https://demo.gethue.com/api/importer/file/preview/?file_path=/tmp/username_abc123_pipe_data.txt&file_type=delimiter_format&import_type=local&sql_dialect=hive&has_header=true&field_separator=|&quote_char=\"&record_separator=\n"
About the delimiter_format file type:
The delimiter_format file type should be used for custom delimited files that don't follow the standard CSV or TSV formats. When using this file type:
- field_separator is required and must be explicitly specified
- quote_char and record_separator should be provided for proper parsing
- the values from the guess_metadata response should be passed to ensure consistent parsing

Parameter validation notes:
- For Excel files, sheet_name is required
- For delimiter_format, always specify the required parameters
- Reuse the record_separator from the guess_metadata response

Response:
{
"type": "csv",
"columns": [
{
"name": "transaction_id",
"type": "INT"
},
{
"name": "product_name",
"type": "STRING"
},
{
"name": "price",
"type": "DOUBLE"
}
],
"preview_data": [
["1001", "Laptop XPS 13", "1299.99"],
["1002", "Wireless Headphones", "149.99"],
["1003", "Office Chair", "249.50"]
]
}
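The columns array maps naturally to a DDL statement. A sketch of deriving a basic CREATE TABLE from the preview response (the helper and the exact DDL flavor are ours, not part of the API):

```python
def create_table_ddl(table_name, preview):
    # Build a basic Hive-style CREATE TABLE from the preview "columns" list
    columns = ",\n  ".join(
        "`%s` %s" % (col["name"], col["type"]) for col in preview["columns"]
    )
    return "CREATE TABLE %s (\n  %s\n)" % (table_name, columns)

# Sample preview response, as returned by /api/importer/file/preview/
preview = {
    "type": "csv",
    "columns": [
        {"name": "transaction_id", "type": "INT"},
        {"name": "product_name", "type": "STRING"},
        {"name": "price", "type": "DOUBLE"},
    ],
}
print(create_table_ddl("sales_data", preview))
```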
Get the mapping from Polars data types to SQL types for a specific SQL dialect.
Endpoint: /api/importer/sql_type_mapping/
Method: GET
Request Parameters:
Name | Type | Required | Description |
---|---|---|---|
sql_dialect | String | Yes | SQL dialect for type mapping (hive, impala, trino, phoenix, sparksql) |
Example using cURL:
curl -X GET \
-H "Authorization: Bearer <YOUR_JWT_TOKEN>" \
"https://demo.gethue.com/api/importer/sql_type_mapping/?sql_dialect=hive"
Response:
{
"Int8": "TINYINT",
"Int16": "SMALLINT",
"Int32": "INT",
"Int64": "BIGINT",
"UInt8": "TINYINT",
"UInt16": "SMALLINT",
"UInt32": "INT",
"UInt64": "BIGINT",
"Float32": "FLOAT",
"Float64": "DOUBLE",
"Boolean": "BOOLEAN",
"Utf8": "STRING",
"String": "STRING",
"Date": "DATE",
"Datetime": "TIMESTAMP"
}
Here's an example workflow that combines all the APIs to import a CSV file into a Hive table:
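A condensed sketch of that workflow with the requests library (upload, guess metadata, guess header, then preview; the endpoints and parameter names are the ones documented above, the function name is ours):

```python
import requests

BASE_URL = "https://demo.gethue.com"

def import_preview(token, local_path, sql_dialect="hive"):
    headers = {"Authorization": "Bearer %s" % token}

    # 1. Upload the local file to the Hue server
    with open(local_path, "rb") as f:
        file_path = requests.post(
            "%s/api/importer/upload/file/" % BASE_URL,
            headers=headers, files={"file": f},
        ).json()["file_path"]

    # 2. Guess the file type and its delimiters
    metadata = requests.get(
        "%s/api/importer/file/guess_metadata/" % BASE_URL,
        headers=headers,
        params={"file_path": file_path, "import_type": "local"},
    ).json()

    # 3. Guess whether the first row is a header
    has_header = requests.get(
        "%s/api/importer/file/guess_header/" % BASE_URL,
        headers=headers,
        params={"file_path": file_path, "file_type": metadata["type"], "import_type": "local"},
    ).json()["has_header"]

    # 4. Preview the data with SQL column types for the target dialect
    return requests.get(
        "%s/api/importer/file/preview/" % BASE_URL,
        headers=headers,
        params={
            "file_path": file_path,
            "file_type": metadata["type"],
            "import_type": "local",
            "sql_dialect": sql_dialect,
            "has_header": has_header,
        },
    ).json()
```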
For the full code example and best practices, refer to the File Import documentation.
Get the list of configured connectors:
curl -X GET https://demo.gethue.com/api/v1/connector/instances
{"connectors": [{"category": "editor", "category_name": "Editor", "description": "", "values": []}, {"category": "browsers", "category_name": "Browsers", "description": "", "values": []}, {"category": "catalogs", "category_name": "Catalogs", "description": "", "values": []}, {"category": "optimizers", "category_name": "Optimizers", "description": "", "values": []}, {"category": "schedulers", "category_name": "Schedulers", "description": "", "values": []}, {"category": "plugins", "category_name": "Plugins", "description": "", "values": []}]}
List the types of connectors that can be instantiated:
curl -X GET https://demo.gethue.com/api/v1/connector/types
{ "connectors": [ { "category": "editor", "category_name": "Editor", "description": "", "values": [ { "dialect": "hive", "nice_name": "Hive", "description": "Recommended", "category": "editor", "interface": "hiveserver2", "settings": [ { "name": "server_host", "value": "localhost" }, { "name": "server_port", "value": 10000 }, { "name": "is_llap", "value": false }, { "name": "use_sasl", "value": true } ], "properties": { "is_sql": true, "sql_identifier_quote": "`", "sql_identifier_comment_single": "--", "has_catalog": false, "has_database": true, "has_table": true, "has_live_queries": false, "has_optimizer_risks": true, "has_optimizer_values": true, "has_auto_limit": false, "has_reference_language": true, "has_reference_functions": true, "has_use_statement": true } },
...........
{ "category": "browsers", "category_name": "Browsers", "description": "", "values": [ { "nice_name": "HDFS", "dialect": "hdfs", "interface": "rest", "settings": [ { "name": "server_url", "value": "http://localhost:50070/webhdfs/v1" }, { "name": "default_fs", "value": "fs_defaultfs=hdfs://localhost:8020" } ], "category": "browsers", "description": "", "properties": {} },
...........
{ "nice_name": "S3", "dialect": "s3", "settings": [], "category": "browsers", "description": "", "properties": {} }, { "nice_name": "ADLS", "dialect": "adls-v1", "settings": [], "category": "browsers", "description": "", "properties": {} } ] }, { "category": "catalogs", "category_name": "Catalogs", "description": "", "values": [ { "nice_name": "Hive Metastore", "dialect": "hms", "interface": "hiveserver2", "settings": [ { "name": "server_host", "value": "" }, { "name": "server_port", "value": "" } ], "category": "catalogs", "description": "", "properties": {} }, { "nice_name": "Atlas", "dialect": "atlas", "interface": "rest", "settings": [], "category": "catalogs", "description": "", "properties": {} },
...........
] }, { "category": "optimizers", "category_name": "Optimizers", "description": "", "values": [ { "nice_name": "Optimizer", "dialect": "optimizer", "settings": [], "category": "optimizers", "description": "", "properties": {} } ] }, { "category": "schedulers", "category_name": "Schedulers", "description": "", "values": [ { "nice_name": "Oozie", "dialect": "oozie", "settings": [], "category": "schedulers", "description": "", "properties": {} },
...........
] }, { "category": "plugins", "category_name": "Plugins", "description": "", "values": [] } ], "categories": [ { "name": "Editor", "type": "editor", "description": "" }, { "name": "Browsers", "type": "browsers", "description": "" }, { "name": "Catalogs", "type": "catalogs", "description": "" }, { "name": "Optimizers", "type": "optimizers", "description": "" }, { "name": "Schedulers", "type": "schedulers", "description": "" }, { "name": "Plugins", "type": "plugins", "description": "" } ] }
The first step is to get the config of the connector we want to instantiate. As input, we pick a connector type from the list of types above by specifying its dialect and interface names.
curl -X POST https://demo.gethue.com/api/v1/connector/instance/new/<DIALECT>/<INTERFACE>
This returns a template that we then fill in and send to the /update call:
curl -X POST https://demo.gethue.com/api/v1/connector/instance/new/hive/sqlalchemy -d 'connector={"nice_name":"Hive Docker Local","name":"41","dialect":"hive","interface":"hiveserver2","settings":[{"name":"server_host","value":"localhost"},{"name":"server_port","value":10000},{"name":"is_llap","value":false},{"name":"use_sasl","value":"true"}],"category":"editor","description":"Recommended","dialect_properties":{"is_sql":true,"sql_identifier_quote":"`","sql_identifier_comment_single":"--","has_catalog":false,"has_database":true,"has_table":true,"has_live_queries":false,"has_optimizer_risks":true,"has_optimizer_values":true,"has_auto_limit":false,"has_reference_language":true,"has_reference_functions":true,"has_use_statement":true}}'
Getting an existing connector instance by its id:
curl -X GET https://demo.gethue.com/api/v1/connector/instance/get/<ID>
This is the same as creating a new connector instance, but as we provide the id we will update the existing instance:
curl -X POST https://demo.gethue.com/api/v1/connector/instance/update -d 'connector={"nice_name":"Hive Docker Local","name":"41","dialect":"hive","interface":"hiveserver2","settings":[{"name":"server_host","value":"localhost"},{"name":"server_port","value":10000},{"name":"is_llap","value":false},{"name":"use_sasl","value":"true"}],"id":"41","category":"editor","description":"Recommended","dialect_properties":{"is_sql":true,"sql_identifier_quote":"`","sql_identifier_comment_single":"--","has_catalog":false,"has_database":true,"has_table":true,"has_live_queries":false,"has_optimizer_risks":true,"has_optimizer_values":true,"has_auto_limit":false,"has_reference_language":true,"has_reference_functions":true,"has_use_statement":true}}'
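The create/update calls above send the connector payload form-encoded under a single connector field. A minimal Python sketch of building that body and the auth header with the standard library (the host, token, and the trimmed-down payload below are placeholders; see the curl examples for the full settings):

```python
import json
from urllib.parse import urlencode

# Hypothetical host and token -- substitute your own Hue URL and JWT access token.
BASE_URL = "https://demo.gethue.com/api/v1"
TOKEN = "<ACCESS_TOKEN>"

def auth_headers(token):
    """Bearer token plus the default form content type used by these endpoints."""
    return {
        "Authorization": "Bearer " + token,
        "Content-Type": "application/x-www-form-urlencoded",
    }

def connector_body(connector):
    """Encode the connector dict as the single form field 'connector',
    mirroring curl's -d 'connector={...}'."""
    return urlencode({"connector": json.dumps(connector)})

# A trimmed-down connector payload (placeholder values).
connector = {
    "nice_name": "Hive Docker Local",
    "name": "41",
    "dialect": "hive",
    "interface": "hiveserver2",
    "settings": [
        {"name": "server_host", "value": "localhost"},
        {"name": "server_port", "value": 10000},
    ],
    "category": "editor",
}

body = connector_body(connector)
headers = auth_headers(TOKEN)
# body and headers can then be POSTed to BASE_URL + "/connector/instance/update"
# with any HTTP client (requests, urllib, ...).
```

The same body/header pair works for the /test and /delete calls below, since they take the same form-encoded connector field.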
Deleting a connector instance by its id:
curl -X POST https://demo.gethue.com/api/v1/connector/instance/delete -d 'connector={"id": "1"}'
Check if the connectivity is healthy:
curl -X POST https://demo.gethue.com/api/v1/connector/instance/test/ -d 'connector={"nice_name":"Hive Docker Local","name":"41","dialect":"hive","interface":"hiveserver2","settings":[{"name":"server_host","value":"localhost"},{"name":"server_port","value":10000},{"name":"is_llap","value":false},{"name":"use_sasl","value":"true"}],"id":"41","category":"editor","description":"Recommended","dialect_properties":{"is_sql":true,"sql_identifier_quote":"`","sql_identifier_comment_single":"--","has_catalog":false,"has_database":true,"has_table":true,"has_live_queries":false,"has_optimizer_risks":true,"has_optimizer_values":true,"has_auto_limit":false,"has_reference_language":true,"has_reference_functions":true,"has_use_statement":true}}'
Install or update the connector examples:
curl -X POST https://demo.gethue.com/api/v1/connector/examples/install/
Get user records in Hue. Requires admin privileges.
curl -X GET https://demo.gethue.com/api/v1/iam/get_users
Optional GET params, e.g. ?username=demo&groups=default&is_active=true
Search user records by list of user IDs. Requires admin privileges.
curl -X GET 'https://demo.gethue.com/api/v1/iam/users?userids=[1100714,1100715]'
{"users": [{"id": 1100714,"username": "demo","first_name": "","last_name": "","email": "","last_login": "2021-10-06T01:36:49.663","editURL": "/useradmin/users/edit/demo"},{"id": 1100715,"username": "hue","first_name": "","last_name": "","email": "","last_login": "2021-08-11T07:15:48.793","editURL": "/useradmin/users/edit/hue"}]}
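The userids query string above can also be built from Python; the bracketed JSON list is percent-encoded, which the server decodes back to the same value (the ids are the ones from the example response):

```python
import json
from urllib.parse import urlencode

# Build the query string for /iam/users?userids=[...].
user_ids = [1100714, 1100715]  # example ids from the response above
params = urlencode({"userids": json.dumps(user_ids, separators=(",", ":"))})
url = "https://demo.gethue.com/api/v1/iam/users?" + params
```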
User list_for_autocomplete API:
curl -X GET https://demo.gethue.com/api/v1/iam/users/autocomplete
Optional GET params are supported as well.
The metadata API powers the external Catalog integrations.
curl -X POST https://demo.gethue.com/api/v1/metadata/search/entities_interactive/ -d 'query_s="*sample"&sources=["documents", "sql", "hdfs", "s3"]'
Some of the parameters:
query_s: the search string, e.g. "*sample"
sources, e.g. ["documents", "sql", "hdfs", "s3"]
facet fields, e.g. ['type', 'owner', 'tags', 'lastModified']
Searching for entities with the dummy catalog:
curl -X POST https://demo.gethue.com/api/v1/metadata/search/entities_interactive/ -d 'query_s="*sample"&interface="dummy"'
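The search form body can be composed in Python as a sketch; note that the values are JSON-encoded, which is why the curl examples quote them (query_s="*sample"):

```python
import json
from urllib.parse import urlencode, parse_qs

# Form body for the entities_interactive search; values are JSON-encoded,
# matching -d 'query_s="*sample"&sources=["documents", "sql", "hdfs", "s3"]'
body = urlencode({
    "query_s": json.dumps("*sample"),
    "sources": json.dumps(["documents", "sql", "hdfs", "s3"]),
})

# Decoding the body back yields the original values.
decoded = {k: json.loads(v[0]) for k, v in parse_qs(body).items()}
```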