Crawl tweets by hashtag using golang

This tutorial shows how to download videos and images from Twitter that are associated with a particular hashtag.

Twitter is a social media platform.

A hashtag is a type of metadata tag used on social networks such as Twitter and other microblogging services. It lets users apply dynamic, user-generated tags so that others can easily find messages with a specific theme or content.

This tutorial targets Go 1.11 and above because it uses Go modules (go mod). If you have an older version, you can ignore go mod and write your program within GOPATH.

The Twitter API provides a search function that lets us search by keywords, hashtags, filters and other configuration options.
The documentation for the search API can be found here.

Let’s get to work!

Create a folder for your project and create a main.go file.

package main

import (
    "fmt"
    "github.com/dghubble/go-twitter/twitter"
    "github.com/dghubble/oauth1"
    "log"
)

func main() {}

Our main.go file will contain the code above. The second import is a Go library that helps us interact with the Twitter API, and the third is an OAuth library that lets us authenticate with it. Go refuses to compile a file with unused imports, so expect build errors until the later steps put each package to use.

Before we continue, let us get authentication credentials from Twitter to use in our program. We can get them by first creating a Twitter app here.
After creating the app, the credentials are found under the app's ‘keys and tokens’ tab.

If you have Go 1.11 or later, run ‘go mod init’ in your working directory. If you have an older version, fetch the dependencies above with ‘go get’ instead.

Now we need to authenticate with the Twitter API and create a client:

func main() {
    config := oauth1.NewConfig("consumer-key", "consumer-secret")
    token := oauth1.NewToken("access-token", "access-secret")
    httpClient := config.Client(oauth1.NoContext, token)
    client := twitter.NewClient(httpClient)
    _ = client // placeholder so this step compiles; the next step uses client
}

Replace the placeholder credentials above with the ones for your app, as found under ‘keys and tokens’.
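Hardcoded secrets are easy to leak once the code is shared. Below is a minimal sketch of one alternative, assuming you export the four values as environment variables. The loadCredentials helper and the TWITTER_* variable names are hypothetical and not part of either library.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// credentials groups the four values from the app's "keys and tokens" tab.
type credentials struct {
	ConsumerKey    string
	ConsumerSecret string
	AccessToken    string
	AccessSecret   string
}

// loadCredentials reads the four values through getenv (normally os.Getenv).
// The TWITTER_* variable names are hypothetical; choose your own.
func loadCredentials(getenv func(string) string) (credentials, error) {
	c := credentials{
		ConsumerKey:    getenv("TWITTER_CONSUMER_KEY"),
		ConsumerSecret: getenv("TWITTER_CONSUMER_SECRET"),
		AccessToken:    getenv("TWITTER_ACCESS_TOKEN"),
		AccessSecret:   getenv("TWITTER_ACCESS_SECRET"),
	}
	if c.ConsumerKey == "" || c.ConsumerSecret == "" || c.AccessToken == "" || c.AccessSecret == "" {
		return credentials{}, errors.New("one or more TWITTER_* environment variables are not set")
	}
	return c, nil
}

func main() {
	creds, err := loadCredentials(os.Getenv)
	if err != nil {
		fmt.Println("cannot load credentials:", err)
		return
	}
	// These values would replace the string literals passed to
	// oauth1.NewConfig and oauth1.NewToken.
	fmt.Println("loaded credentials; consumer key length:", len(creds.ConsumerKey))
}
```

Passing the lookup function as a parameter keeps the helper easy to test with a plain map instead of the real environment.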

We can now use the search function of the client to search for tweets:

func main() {
    config := oauth1.NewConfig("consumer-key", "consumer-secret")
    token := oauth1.NewToken("access-token", "access-secret")
    httpClient := config.Client(oauth1.NoContext, token)
    client := twitter.NewClient(httpClient)
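    searchResults, _, err := client.Search.Tweets(&twitter.SearchTweetParams{
        Query: "#memes filter:images -filter:nativeretweets",
        Count: 40,
    })
    if err != nil {
        log.Fatalf("Could not search tweets: %s", err)
    }
    // Report how many tweets came back; the next step loops over them.
    log.Printf("Found %d tweets", len(searchResults.Statuses))
}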

The Search.Tweets function returns a pointer to a Search struct, a pointer to an http.Response struct and an error. Since we do not need the http.Response in this case, we assign it to the blank identifier _.
We then check for an error and handle it appropriately.

In the &twitter.SearchTweetParams struct, the Query property is where we enter the query. You can learn more about how to build a search query by going to this link.
In our query, #memes means we want tweets that have the #memes hashtag, filter:images keeps only tweets that contain images, and -filter:nativeretweets excludes retweets. Count is the number of tweets to return per page.
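To make the parts of the query easier to vary, the string can be assembled from its pieces. This is a minimal sketch; buildQuery is a hypothetical helper, not part of the go-twitter library, and it only covers the two filters used in this tutorial.

```go
package main

import (
	"fmt"
	"strings"
)

// buildQuery assembles a search query for one hashtag.
// imagesOnly adds "filter:images"; skipRetweets adds "-filter:nativeretweets".
func buildQuery(hashtag string, imagesOnly, skipRetweets bool) string {
	// Accept the hashtag with or without its leading '#'.
	parts := []string{"#" + strings.TrimPrefix(hashtag, "#")}
	if imagesOnly {
		parts = append(parts, "filter:images")
	}
	if skipRetweets {
		parts = append(parts, "-filter:nativeretweets")
	}
	return strings.Join(parts, " ")
}

func main() {
	// Prints: #memes filter:images -filter:nativeretweets
	fmt.Println(buildQuery("memes", true, true))
}
```

The result can be passed as the Query field in place of the string literal.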

The searchResults variable points to a Search struct whose Statuses field is a slice of tweets, which means we can loop through it and print the URL of each image in every tweet.

func main() {
    config := oauth1.NewConfig("consumer-key", "consumer-secret")
    token := oauth1.NewToken("access-token", "access-secret")
    httpClient := config.Client(oauth1.NoContext, token)
    client := twitter.NewClient(httpClient)
    searchResults, _, err := client.Search.Tweets(&twitter.SearchTweetParams{
        Query: "#memes filter:images -filter:nativeretweets",
        Count: 40,
    })

    if err != nil {
        log.Fatalf("Could not search tweet %s", err)
    }

    for _, tweet := range searchResults.Statuses {
        // Tweets without attached media have no extended entities.
        if tweet.ExtendedEntities == nil {
            continue
        }
        media := tweet.ExtendedEntities.Media
        for _, m := range media {
            // Print one media URL per line.
            fmt.Println(m.MediaURL)
        }
    }
}
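A tweet's media list can hold more than still images. The sketch below keeps only photos, assuming the library's media struct exposes a Type field that mirrors the Twitter API's media type (photo, video or animated_gif). The three types are simplified stand-ins for the go-twitter structs so the example runs on its own, and photoURLs is a hypothetical helper.

```go
package main

import "fmt"

// The types below are simplified stand-ins for the go-twitter structs used
// in the tutorial; they keep only the fields this sketch needs.
type mediaEntity struct {
	Type     string // "photo", "video" or "animated_gif" in the Twitter API
	MediaURL string
}

type extendedEntity struct {
	Media []mediaEntity
}

type tweet struct {
	ExtendedEntities *extendedEntity
}

// photoURLs returns the URL of every photo, skipping tweets without media
// and any media that is not a still image.
func photoURLs(tweets []tweet) []string {
	var urls []string
	for _, t := range tweets {
		if t.ExtendedEntities == nil {
			continue
		}
		for _, m := range t.ExtendedEntities.Media {
			if m.Type == "photo" {
				urls = append(urls, m.MediaURL)
			}
		}
	}
	return urls
}

func main() {
	// Made-up sample data: one tweet without media, one with two attachments.
	tweets := []tweet{
		{ExtendedEntities: nil},
		{ExtendedEntities: &extendedEntity{Media: []mediaEntity{
			{Type: "photo", MediaURL: "http://example.com/a.jpg"},
			{Type: "video", MediaURL: "http://example.com/b.jpg"},
		}}},
	}
	for _, u := range photoURLs(tweets) {
		fmt.Println(u)
	}
}
```

Collecting the URLs into a slice first also separates the search step from the download step, which makes each easier to test.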

With the URLs in hand, we can download the images and save them. To do that, we need this package: https://github.com/cavaliercoder/grab. Our imports should now look like this:

import (
    "log" // used for error reporting throughout main

    "github.com/cavaliercoder/grab"
    "github.com/dghubble/go-twitter/twitter"
    "github.com/dghubble/oauth1"
)

We can now use the Get function of the package to download a file, as in the snippet below:

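// Download the file into the current directory (".").
resp, err := grab.Get(".", "http://www.golang-book.com/public/pdf/gobook.pdf")
if err != nil {
    log.Fatal(err)
}
// resp.Filename holds the path the file was saved to.
log.Printf("Download saved to %s", resp.Filename)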
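To show roughly what such a download involves, here is a sketch that uses only the standard library (net/http and io.Copy) in place of grab. It is not how grab is implemented, and it omits conveniences such as resuming and progress reporting. The download and runDemo functions are hypothetical, and runDemo fetches from a local test server so the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"os"
	"path"
	"path/filepath"
)

// download saves the body of rawURL into dir, naming the file after the
// last element of the URL path, and returns the path of the saved file.
func download(dir, rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	name := path.Base(u.Path)
	if name == "." || name == "/" {
		return "", fmt.Errorf("cannot derive a file name from %q", rawURL)
	}

	resp, err := http.Get(rawURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s for %s", resp.Status, rawURL)
	}

	dst := filepath.Join(dir, name)
	f, err := os.Create(dst)
	if err != nil {
		return "", err
	}
	// Stream the body to disk, then close the file and report either error.
	_, copyErr := io.Copy(f, resp.Body)
	closeErr := f.Close()
	if copyErr != nil {
		return "", copyErr
	}
	if closeErr != nil {
		return "", closeErr
	}
	return dst, nil
}

// runDemo downloads one file from a local test server and returns the
// saved file name and its content.
func runDemo() (string, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "fake image bytes")
	}))
	defer srv.Close()

	dir, err := os.MkdirTemp("", "media")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	saved, err := download(dir, srv.URL+"/cat.jpg")
	if err != nil {
		return "", "", err
	}
	data, err := os.ReadFile(saved)
	if err != nil {
		return "", "", err
	}
	return filepath.Base(saved), string(data), nil
}

func main() {
	name, content, err := runDemo()
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Printf("saved %s (%d bytes)\n", name, len(content))
}
```

Checking the status code before writing avoids saving an error page under an image's file name.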

Our code should now look like this:

func main() {
    config := oauth1.NewConfig("consumer-key", "consumer-secret")
    token := oauth1.NewToken("access-token", "access-secret")
    httpClient := config.Client(oauth1.NoContext, token)
    client := twitter.NewClient(httpClient)
    searchResults, _, err := client.Search.Tweets(&twitter.SearchTweetParams{
        Query: "#memes filter:images -filter:nativeretweets",
        Count: 40,
    })

    if err != nil {
        log.Fatalf("Could not search tweet %s", err)
    }

    for _, tweet := range searchResults.Statuses {
        if tweet.ExtendedEntities == nil {
            continue
        }
        media := tweet.ExtendedEntities.Media
        for _, m := range media {
            _, err := grab.Get(".", m.MediaURL)
            if err != nil {
                log.Fatal(err)
            }
        }
    }
}
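The program above fetches a single page of up to 40 tweets. To go further back, one common approach is to repeat the search with a max_id set just below the oldest tweet already seen. The sketch below shows only that calculation; it assumes the search endpoint follows the max_id convention from Twitter's timeline documentation and that SearchTweetParams exposes it as a MaxID field. nextMaxID is a hypothetical helper.

```go
package main

import "fmt"

// nextMaxID returns the max_id to request for the following page: one less
// than the smallest tweet ID seen so far. ok is false when ids is empty,
// which means there is nothing left to page through.
func nextMaxID(ids []int64) (maxID int64, ok bool) {
	if len(ids) == 0 {
		return 0, false
	}
	lowest := ids[0]
	for _, id := range ids[1:] {
		if id < lowest {
			lowest = id
		}
	}
	return lowest - 1, true
}

func main() {
	// Hypothetical tweet IDs collected from one page of results.
	page := []int64{1050, 1042, 1037}
	if maxID, ok := nextMaxID(page); ok {
		// Prints: next max_id: 1036
		fmt.Println("next max_id:", maxID)
	}
}
```

Subtracting one keeps the oldest tweet of the current page from showing up again on the next page.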

We can now build and run our program with go build. If you run into difficulties, drop a comment below.
