Shuzhan Fan
Sharing is caring.
-
Geo-tagged tweets collection using Twitter Streaming API and database
One research project I’m working on uses Twitter data to predict crime patterns. So the first thing I need to do is collect Twitter data. Specifically, since I’m interested in discovering the spat...
-
Machine Learning Classification Model Evaluation Metrics
After training the machine learning classification model, we should always evaluate the model to determine if it does a good job of predicting the target value on new unseen data. Among the various...
-
Running Jupyter Notebook with Apache Spark on Google Cloud Compute Engine
Apache Spark is a powerful open-source cluster-computing framework. Compared to Apache Hadoop, especially Hadoop MapReduce, Spark has advantages such as speed, generality, ease of use, and interact...
-
How to install and set up MySQL on Mac
MySQL is probably the most popular open-source SQL relational database. Unfortunately, macOS doesn’t ship with MySQL. I still remember when I took my first database class years ago, the professor h...
-
Using Python subprocess for parallel processing
Unlike JavaScript, which is naturally asynchronous, the Python interpreter executes code sequentially. Subsequent jobs have to wait until the previous ones have completed. This behav...
-
My first blog
This is my first blog, EVER! I’ve always been thinking of writing something about the work I do, sharing the knowledge I know, and of course, learning new stuff in turn. Now finally I made my decis...