Problem: after connecting to Hive with PyHive directly, pandas emits a warning recommending SQLAlchemy instead.
Solution:
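import pandas
from sqlalchemy.engine import create_engine

from .config import config

# Hive connection settings; credentials come from the project's config module.
database_config = {
    'host': config['HV_SVR'],
    'port': 10000,
    'auth': "CUSTOM",
    'database': config['HV_DB'],
    'user': config['HV_USER'],
    'password': config['HV_PASS'],
}
# SQLAlchemy URL for the PyHive dialect.
database_connect = (
    "hive://{user}:{password}@{host}:{port}/{database}?auth={auth}".format(
        **database_config)
)
engine = create_engine(database_connect)

sql = "SELECT 1"  # placeholder query; replace with your own
# Pass the SQLAlchemy connection (not a raw PyHive connection) to pandas.
with engine.connect() as hive_conn:
    df = pandas.read_sql(sql=sql, con=hive_conn)
print(df)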
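One caveat with building the URL through plain string formatting: a password containing characters such as @ or / can make the URL parse incorrectly. A minimal sketch of one way around this, percent-encoding the credentials with the standard library first. The helper name build_hive_url and the sample values are hypothetical, not part of the original setup.

```python
from urllib.parse import quote


def build_hive_url(user, password, host, port, database, auth="CUSTOM"):
    """Build a hive:// URL, percent-encoding the user name and password."""
    # safe="" makes quote() escape every reserved character, including '/'.
    return (
        f"hive://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}?auth={auth}"
    )


# Made-up values for illustration; '@' and '/' in the password are escaped.
url = build_hive_url("alice", "p@ss/word", "hive.example.com", 10000, "default")
print(url)
```

The resulting string can be passed to create_engine in place of database_connect.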
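Note: this fails with pandas==2.2.2:
With con=engine: AttributeError: 'Engine' object has no attribute 'cursor'
With con=hive_conn: AttributeError: 'Connection' object has no attribute 'cursor'
A likely cause: pandas 2.2 raised its minimum supported SQLAlchemy version to 2.0, so with sqlalchemy==1.4.50 it appears to treat the object as a raw DBAPI connection and looks for a cursor.
Versions that work: sqlalchemy==1.4.50 thrift==0.20.0 thrift-sasl==0.4.3 pyhive==0.7.0 pandas==2.1.4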
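Since the error depends on which versions are installed, it can help to compare the environment against the combination listed above before debugging further. A small sketch using only the standard library; the names KNOWN_GOOD, installed_versions and report_mismatches are illustrative, and the pinned versions are simply the ones from this note.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in the note above.
KNOWN_GOOD = {
    "sqlalchemy": "1.4.50",
    "thrift": "0.20.0",
    "thrift-sasl": "0.4.3",
    "pyhive": "0.7.0",
    "pandas": "2.1.4",
}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report_mismatches(expected):
    """Return {package: (expected, installed)} for every differing package."""
    actual = installed_versions(expected)
    return {
        name: (want, actual[name])
        for name, want in expected.items()
        if actual[name] != want
    }


if __name__ == "__main__":
    for name, (want, got) in report_mismatches(KNOWN_GOOD).items():
        print(f"{name}: expected {want}, found {got}")
```

A mismatch does not by itself mean the setup is broken; it only shows where the environment differs from the combination that was tested here.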
References:
https://stackoverflow.com/questions/55314977/pandas-read-sql-attributeerror-engine-object-has-no-attribute-cursor
https://blog.csdn.net/qq_40304090/article/details/108263224
https://pypi.org/project/PyHive/