Zingg On Databricks #449
-
I have tried running the FebrlExample.py and the ncVoters.py examples on Databricks in Azure. In both cases I get an error when executing zingg.initAndExecute(). In both cases, I ensured the paths were correct and that all objects (args, options, pipes) were instantiated. I am fairly new to Databricks and Python, so any help you can give to point me in the right direction would be appreciated. Environment Databricks Runtime Version:
Replies: 4 comments 7 replies
-
Thanks for trying Zingg, @eric6204. Can you try changing ZinggWithSpark to Zingg?
-
OK, let me check this and get back to you. Thanks for reporting.
-
Also, since you are calling the trainMatch job, do you already have the febrl model loaded on DBFS? Can you please share your code and the locations of the CSV and model folders in DBFS, along with their contents?
-
Thanks for this. I am able to run the findTrainingData phase of FebrlExample.py using ZinggWithSpark instead of Zingg in the example code. The reason it hasn't worked for you is that the example uses the trainMatch phase, which needs the model folder to be located in DBFS. Could you please:
1. Change the example to use ZinggWithSpark (you were right, this is the class to use with Databricks :D)
2. Upload models/100 to a location of your choice
3. Set zinggDir to adjust to the location of the models folder uploaded above
4. Run
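
Before step 4, it may help to confirm that the uploaded model folder is actually where zinggDir points. The snippet below is only a rough sketch in plain Python, not part of Zingg: the helper names are hypothetical, it assumes the model lives at `<zinggDir>/<modelId>` (as `models/100` suggests), and it assumes DBFS is reachable from the driver through the local `/dbfs/...` mount. The example path in the comments is made up.

```python
import os


def model_folder_path(zingg_dir: str, model_id: str) -> str:
    """Build the expected model location, assuming a <zinggDir>/<modelId> layout."""
    return os.path.join(zingg_dir, str(model_id))


def model_folder_ready(zingg_dir: str, model_id: str) -> bool:
    """Return True if the model folder exists and contains at least one entry."""
    path = model_folder_path(zingg_dir, model_id)
    return os.path.isdir(path) and len(os.listdir(path)) > 0


# Hypothetical usage on a Databricks driver (the path is an assumption):
#   zingg_dir = "/dbfs/FileStore/zingg/models"
#   if not model_folder_ready(zingg_dir, "100"):
#       raise FileNotFoundError(model_folder_path(zingg_dir, "100"))
#   # then set zinggDir in the example to the matching location and run it
```

If the check fails, the upload location and the zinggDir value in the example most likely do not match.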