A jump start to data/sentiment analysis using the Twitter API, Socket.io, and a NodeJS application

To follow along with the complete code, you can clone or download the project from my profile on GitHub.

Social media posts can give us meaningful perspectives on how we see the world. From London to Bali, we’ve replaced postcards with Instagram selfies, shared our thoughts about an election on Facebook, and tweeted out news instead of waiting for it in a newspaper. Conveying our thoughts and other things about ourselves used to take days or months. Now we can express ourselves anywhere, anytime, with an internet connection.

In the wake of the ongoing Facebook data debacle, we’ve become more aware of how our data is used and who gets to see it. The ethical questions are a big concern, but we also have to face the fact that the efficiency of the web has made data gathering easier, allowing companies to profile us in ways that make it seem they know us better than we know ourselves. If you’ve ever seen an ad follow you around the web, or gotten special rates when you shop online, that’s most likely based on data the company has about your online actions. With that said, we too can harness the power of web data when we build apps for our own startups, businesses, or projects, whether to better serve our customers, run sentiment analysis, or predict future outcomes.

For this tutorial, we’ll make a simple server app that gathers the latest Tweets about startups and app ideas. This could be useful if you’re a developer looking for side projects, or a budding entrepreneur looking for an opportunity. This is also my first tutorial written with JavaScript as the main server-side programming language, as we’ll be coding in NodeJS. We’ll store the Tweet information in MongoDB.


To get started on this project, you’ll need some familiarity with the following, and have them installed or set up on your computer:

a. NodeJS

b. MongoDB 

c. Twitter Developer API Key – Get one at https://developer.twitter.com/

d. (Optional) MongoDB GUI reader such as Robo 3T

e. (Optional) Nodemon - Highly recommended Node package that detects changes in your server files and restarts the app, saving you time while working on Node projects!

Project Structure

Once completed, the project will get the latest Tweets in real time using Twitter’s Stream API. This lets us watch Tweets about startups and app ideas appear on a page as they are posted, which is fun to see as they magically pop up on the screen. We’ll initiate the stream using our Twitter API credentials, then take the data from each individual Tweet and store it in our MongoDB database, which is well suited to handling many requests at the same time.

For our database, we’ll create a file and run a command that will feel similar to a database migration for those who have worked with web development frameworks like Django, Ruby on Rails, or Laravel. Because MongoDB takes a document-oriented approach and stores data as JSON-like documents, we’ll insert data in a format like { key: value }. Unlike in a relational database like MySQL, you can add new fields to one document in a collection without affecting the others. For now, we’ll store a Tweet’s ID, the text, the author’s name and image, and the date written. By default, MongoDB adds an _id field to each document, which you can use to look it up in a query should you decide to expand the app further.
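As a rough illustration of the document shape described above, the sketch below maps a Tweet object to the five fields we plan to store. The helper name toTweetDocument and the sample Tweet are made up for this example. It reads id_str, the string form of the ID, on the assumption that you want the ID stored exactly, since numeric Tweet IDs can be larger than a JavaScript number can represent precisely.

```javascript
// Hypothetical helper: picks the fields we want to keep from a Tweet object.
function toTweetDocument(tweet) {
  return {
    tweet_id: tweet.id_str, // string ID (assumption: we want the exact value)
    tweet: tweet.text,
    twitter_handle_image: tweet.user.profile_image_url,
    twitter_handle: tweet.user.screen_name,
    created_at: tweet.created_at
  };
}

// Made-up sample input, trimmed to the keys used above.
const sampleTweet = {
  id_str: '987654321098765432',
  text: 'I want an app that waters my plants',
  created_at: 'Mon Apr 16 12:00:00 +0000 2018',
  user: {
    screen_name: 'example_user',
    profile_image_url: 'http://example.com/avatar.png'
  }
};

const doc = toTweetDocument(sampleTweet);
console.log(doc);
```

MongoDB would add its own _id to this object when it is inserted, so there is no need to set one here.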

We’ll be using the following NodeJS packages:

a. Express – Our main application framework 

b. MongoDB – The official MongoDB NodeJS driver

c. Twit – A package created to access Twitter’s API via NodeJS 

d. Socket.io – One of the awesome packages for Node developers, which gives us real-time, two-way communication between the server and the browser so we can build real-time applications

Set Up

To get started, open your terminal and create a new folder on your desktop, or wherever you want to work, with mkdir twitterfeed. Then run cd twitterfeed to enter your newly created project folder, and set up the Node project with npm init. You’ll be prompted to enter some settings for the app; you can keep all the defaults for now. At this point you should have a package.json file that lists our main file as “index.js.” Next, run the following command to install the necessary project packages: npm install --save express socket.io twit mongodb.

For our database and collection creation, make sure your MongoDB instance is up and running; you can check by opening a new terminal window and connecting with the command mongo. Then, create a new file within your project called db.js. We’ll first load the MongoDB package, use it to create a new Mongo client, and establish our connection. With that connection, we’ll create our database and the collection. In fewer than 20 lines of code, your db.js file should look like:

// To create this database and collection, run "node db.js".
// If you want to use a different database or collection name,
// you are more than welcome to change it.
var MongoClient = require('mongodb').MongoClient;
var url = "mongodb://localhost:27017/";

MongoClient.connect(url, function(err, db) {
  if (err) throw err;
  // The database name
  var dbo = db.db("twitterfeed");
  // Our collection name
  dbo.createCollection("tweets", function(err, res) {
    if (err) throw err;
    console.log("Collection created!");
    db.close();
  });
});

After saving the db.js file, you can run node db.js. You can check that it worked by opening your MongoDB database on the command line or using Robo 3T to see a new database “twitterfeed” and a collection named “tweets.” Before proceeding further, check that your package.json file lists all the installed Node packages.


Now that we’re done with the database, we can work on the main functions of our application. In index.js, we’ll load all the necessary packages, create a new server for our app, and connect to our newly created MongoDB database. We’ll define the keywords to search for on Twitter and our API credentials before establishing our Socket.io connection. Within the Socket.io connection, we open the stream and take data from each Tweet to store in the database.

In fewer than 70 lines of code, it should look like the following (be sure to replace the Twitter credentials with your own):

var express = require('express');
var app = express();
var http = require('http');
var server = http.createServer(app);
var mongo = require('mongodb');
var Twit = require('twit');
var io = require('socket.io').listen(server);

var MongoClient = require('mongodb').MongoClient;
var url = "mongodb://localhost:27017/";

server.listen(3000, function() {
  console.log("The server is running.");
});

// index.html is a simple page that lets you watch the Tweet stream
// as it updates, thanks to Socket.io
app.get('/', function (req, res) {
  res.sendFile(__dirname + '/index.html');
});

// Your keywords to search within the Tweet stream, one phrase per entry
var watchlist = ['startup ideas', 'problems', 'i want an app', 'new app', 'startup'];

// Twitter API credentials
var T = new Twit({
  consumer_key: '[ your Twitter API Consumer key ]',
  consumer_secret: '[ your Twitter API Consumer secret ]',
  access_token: '[ your Twitter API access token ]',
  access_token_secret: '[ your Twitter Access token secret ]',
});

// Socket.io connection: the stream starts when a browser connects
io.sockets.on('connection', function (socket) {
  var stream = T.stream('statuses/filter', { track: watchlist });

  stream.on('tweet', function (tweet) {
    // When a new Tweet pops into the stream, we get some data from the Tweet object.
    // More information about the object keys you can use can be found at
    // https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object
    // id_str is the string form of the ID; the numeric id can lose precision in JavaScript.
    io.sockets.emit('stream', tweet.user.profile_image_url + ","
      + tweet.created_at + "," + tweet.id_str + "," + tweet.text
      + ", @" + tweet.user.screen_name);

    // Store the Tweet in the database
    MongoClient.connect(url, function(err, db) {
      if (err) throw err;
      var dbo = db.db("twitterfeed");
      var myobj = {
        tweet_id: tweet.id_str,
        tweet: tweet.text,
        twitter_handle_image: tweet.user.profile_image_url,
        twitter_handle: tweet.user.screen_name,
        created_at: tweet.created_at,
      };
      dbo.collection("tweets").insertOne(myobj, function(err, res) {
        if (err) throw err;
        console.log("1 document inserted");
        db.close();
      });
    });
  });
});
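To get a feel for which Tweets a watchlist would catch, here is a small local approximation of how Twitter documents the track parameter: a phrase matches when all of its space-separated terms appear in the Tweet, in any order and ignoring case, and any one matching phrase is enough. The function matchesWatchlist is hypothetical and only looks at plain text, whereas the real service checks more than that, so treat it as a sketch for experimenting with keywords rather than a faithful copy of Twitter's matching.

```javascript
// Hypothetical helper that mimics the documented "track" matching rule
// on plain text only.
function matchesWatchlist(text, watchlist) {
  // Break the text into lowercase words, dropping punctuation, # and @.
  const words = text.toLowerCase().split(/[^a-z0-9_]+/).filter(Boolean);

  return watchlist.some(function (phrase) {
    const terms = phrase.toLowerCase().split(/\s+/).filter(Boolean);
    // A phrase matches only if every one of its terms is present.
    return terms.length > 0 && terms.every(function (term) {
      return words.indexOf(term) !== -1;
    });
  });
}

const sampleWatchlist = ['startup ideas', 'i want an app', 'new app'];

console.log(matchesWatchlist('An APP is what I want', sampleWatchlist));
console.log(matchesWatchlist('New phone, who dis', sampleWatchlist));
```

The first call matches because every term of "i want an app" is present, even though the order differs. The second does not, since "new" appears without "app".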

For this project we’ll use index.html to view the Twitter stream and confirm the app is working. It loads Socket.io on the client side and receives the Twitter stream, displaying it within <div id="tweetd">. Your index.html file should look like this:

<!DOCTYPE html>
<html>
<head>
  <title>Twitter Feed Fun</title>
</head>
<body>
  <script src="/socket.io/socket.io.js"></script>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.js"></script>
  <script>
    var socket = io.connect('http://localhost:3000');
    socket.on('stream', function(tweet) {
      $('#tweetd').append(tweet + '<br>');
    });
  </script>
  <div id="tweetd"></div>
</body>
</html>
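Because the page appends each Tweet straight into the DOM as HTML, any markup inside a Tweet's text would be interpreted by the browser. Below is a minimal sketch of escaping the text first. The helper escapeHtml is hypothetical and not part of the project; using jQuery's .text() on a new element would be another way to get the same effect.

```javascript
// Hypothetical helper: replaces characters that have special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Made-up example of a Tweet string containing markup.
const escaped = escapeHtml('<b>new app</b> & "ideas"');
console.log(escaped);

// In index.html the handler could then become:
//   $('#tweetd').append(escapeHtml(tweet) + '<br>');
```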

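With everything set, you can run the app with the command node index.js, or nodemon index.js if you have Nodemon installed. Then open http://localhost:3000 in your browser; the stream starts once the page connects through Socket.io. If it works properly, you should see Tweets appear on the page and new entries in your database.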

If you decide to expand this further, you can use the data collected by this app for sentiment analysis, or create an API so another project can display these Tweets. Another option is to send out automated messages when a Tweet contains certain words of interest. You can also follow your favorite company and get updates when they Tweet, which would suit bloggers and writers looking to break the next big business or industry story.
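As a starting point for the sentiment analysis idea, here is a deliberately naive sketch that scores a Tweet's text by counting words from two tiny word lists. The lists and the function name scoreSentiment are invented for this example, and a real project would more likely use a larger lexicon or a trained model, but it shows the kind of processing you could run over the text stored in the tweets collection.

```javascript
// Made-up word lists, for illustration only.
const POSITIVE_WORDS = ['love', 'great', 'awesome', 'useful'];
const NEGATIVE_WORDS = ['hate', 'broken', 'annoying', 'terrible'];

// Hypothetical scorer: +1 for each positive word, -1 for each negative word.
function scoreSentiment(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  return words.reduce(function (score, word) {
    if (POSITIVE_WORDS.indexOf(word) !== -1) return score + 1;
    if (NEGATIVE_WORDS.indexOf(word) !== -1) return score - 1;
    return score;
  }, 0);
}

console.log(scoreSentiment('I love this awesome app'));
console.log(scoreSentiment('This app is broken and annoying'));
```

A score above zero suggests a positive Tweet and a score below zero a negative one. Words outside the lists are simply ignored.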
