John F Kidd on Wed, 17 May 2017 20:07:50
When we use the Hive ODBC drivers in Power BI, we see all the Hive tables and views that the user has access to. However, in Azure Data Catalog we do not see any tables or views. Is this a known issue?
John F Kidd
Troy Yin on Mon, 22 May 2017 03:48:19
That is definitely not the expected behavior. You need to install the Hive ODBC driver yourself and then use Azure Data Catalog to connect to the Hive data source.
Can I ask a few questions to help clarify the issue?
1. Do you use the same machine to connect to Hive with both Power BI and Azure Data Catalog? If not, first install the Hive ODBC driver on the machine where you want to use Azure Data Catalog. Then connect to Hive through ODBC Data Source Administrator to verify that the driver is installed correctly, and try again in Azure Data Catalog.
2. Which Hive data source exactly were you trying to connect to, and which version of the driver did you install?
John F Kidd on Tue, 23 May 2017 15:28:50
Thanks for getting back to me.
- We are using the Microsoft Hive ODBC drivers downloaded from the website.
- We are using the same machine and the same drivers for Power BI and ADC.
As I said, we seem to connect but cannot see any tables. We are using Kerberos authentication.
John F Kidd on Tue, 23 May 2017 15:50:21
It is also worth mentioning that the ODBC Database page of the desktop app asks for:
- Connection String
This seems strange, as we should be able to select a system DSN as we do with Power BI.
John F Kidd on Tue, 23 May 2017 15:55:39
And finally, to be clear, we are running an on-premises HDP 2.5 cluster. Does Azure Data Catalog only work with HDInsight?
Troy Yin on Wed, 24 May 2017 07:26:58
The ODBC Database page of the registration tool does not support DSN connections so far, so you need to enter the information from the DSN file on this page.
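As a rough illustration of moving DSN settings into a connection string, here is a minimal Python sketch. It assumes the DSN settings are available as file-DSN text with an [ODBC] section; a system DSN would first have to be copied out as key/value pairs. The key names (Host, Port, HiveServerType, AuthMech, UID), the host name and the user name are placeholders, not values from this thread, and should be checked against the driver's own documentation. The function name is hypothetical.

```python
import configparser


def dsn_text_to_connection_string(dsn_text: str) -> str:
    """Flatten the [ODBC] section of file-DSN text into 'Key=Value;...' form."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(dsn_text)

    parts = []
    for key, value in parser["ODBC"].items():
        # Wrap values containing ';' in braces so the separator is not misread.
        if ";" in value and not (value.startswith("{") and value.endswith("}")):
            value = "{" + value + "}"
        parts.append(f"{key}={value}")
    return ";".join(parts)


# Hypothetical DSN content; every value below is an assumption for illustration.
EXAMPLE_DSN = """\
[ODBC]
DRIVER=Microsoft Hive ODBC Driver
Host=hive.example.internal
Port=10000
HiveServerType=2
AuthMech=3
UID=hypothetical_user
"""

if __name__ == "__main__":
    print(dsn_text_to_connection_string(EXAMPLE_DSN))
```

The resulting string is what would be pasted into the Connection String field. Passwords are left out of the example on purpose.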
Troy Yin on Wed, 24 May 2017 08:14:47
I tested with an HDP 2.5 cluster (I created a VM on Azure and set up an HDP 2.5 cluster on it), and it works in my local test.
Here is what I did:
1. Set up a VM on Azure and host an HDP 2.5 cluster on it.
2. Install the latest Microsoft Hive ODBC driver. Here is the link: https://www.microsoft.com/en-us/download/details.aspx?id=40886
3. In the VM network settings, open port 10000 in the network security group.
4. Launch the registration tool and, on the Hive page, enter:
- Server (the public IP address in my case),
- Port (10000 in my case, because I opened port 10000 in the network security group),
- Authentication Type (choose "Basic"),
- User Name and Password.
5. Click Connect. I can see three databases in the server hierarchy: "default", "foodmart" and "xademo".
6. Click "default". It shows two tables under Available objects: "sample_07" and "sample_08".
7. Select both of them, check "Include Preview" and "Include Data Profile", add my email address as an expert, and click Register.
I'm not sure whether you missed any of these steps. If so, please follow the steps above and try once more.
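Before repeating the steps, it may help to confirm that the Hive port is reachable from the machine running the registration tool. The following is a minimal Python sketch of such a check. The function name and the host name are hypothetical, and 10000 is simply the port used in the steps above. It only tests TCP reachability, not authentication.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host; replace with the server address entered on the Hive page.
    print(port_is_reachable("hive.example.internal", 10000))
```

A False result points to a network or firewall problem rather than a driver or credential problem.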
John F Kidd on Thu, 08 Jun 2017 15:12:23
Our instance is Kerberized and uses Active Directory. I followed the steps above (which I had already tried) and got an authentication error.
I think you need to have the same environment as me to test this properly.
Troy Yin on Mon, 12 Jun 2017 07:51:46
So far the Registration Tool does not support DSN, which is the way to connect to HDP via Power BI, and there is no Kerberos authentication option on the Hive page of the Registration Tool. I tested your scenario, and the Registration Tool returned an authentication error for me as well. Based on the current implementation, Kerberos is not supported in the Registration Tool so far.
John F Kidd on Fri, 16 Jun 2017 10:47:28
Thanks for getting back to us. We were really banking on rolling out this product to manage our business data catalog for our data lake. We are currently Hadoop / HDP onsite and looking to use HDInsight as a joint cloud and onsite cluster. The HDInsight instance will also be secured using AD.
As this won't happen in the near term, are there any plans to fix the issue above and provide DSN and AD support so that we have the same behaviour as Power BI?
If not we will need to look elsewhere.
Troy Yin on Mon, 19 Jun 2017 06:50:52
The Registration Tool does not support DSN connections for Hive so far. This is a feature gap, and we will add it to our development plan.
Until the plan has been estimated, there is no ETA. I'm very sorry for the imperfect user experience.
John F Kidd on Mon, 19 Jun 2017 16:27:59
Thanks for letting us know, Troy. I guess for now we will have to look at another solution. Please keep me posted on any new features that will solve our problem.
Daniel Overdevest on Mon, 29 Jul 2019 13:35:54
Hi Troy, any update on this? I can't find the Data Catalog documentation on how to fill in the ODBC Database or Hive Database page. I have some issues configuring the HTTP path. We're using Databricks in Azure, and it's not clear which settings to configure.
HimanshuSinha-msft on Wed, 31 Jul 2019 22:14:57
Hello Daniel,
We will reach out to the internal team and update you once we hear back from them.
HimanshuSinha-msft on Thu, 01 Aug 2019 16:52:05
I just got a reply from the Azure Data Catalog team: the ODBC driver is not set up to work with Databricks, and we do not have a resolution for that at this time.
Arek_D on Sat, 28 Dec 2019 10:18:43
Is there anything new on this subject?
We are looking for a data governance tool. We have a lot of tables in Hive, and we need to decide shortly which tool to use.
Is Data Catalog worth our effort or not?