October 10, 2021

Advanced API Usage

Author and boolean searches in the Open Measures API let users fine-tune and target their queries.

The Open Measures API can be fine-tuned to your exact needs. To show this, we will spell out the steps needed to pull up to 10k of Guo Wengui’s posts on Gettr, as a companion to our recent post outlining some of Gettr’s sketchy past and present.

This post builds upon the information shown in our original API guidance blog post. The key element of this advanced usage is writing the “term” query in Elasticsearch query string syntax while setting the es_query field to True.

Much of this can be found in our example Colab/Jupyter notebook as well.

Getting Started

After heading over to our interactive API docs, click the content button:

api content button
Content button in Open Measures’ interactive API docs tool.
  1. Click “Try it out.” 
  2. Next to “term” write any interesting word for now. 
  3. On “site” select “Gettr.” 
  4. Leave all the other settings default for now and click “Execute.” 
  5. This will generate a “Request URL.” If you copy that link into a new browser window, you will see the JSON for the data you requested. 
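The steps above can also be sketched in code. This example only assembles a URL and makes no network call. The base URL is a placeholder, and the parameter names “term” and “site” (and the lowercase value “gettr”) are assumed to mirror the form labels, so compare the result against the real “Request URL” from the docs tool before relying on it.

```python
from urllib.parse import urlencode

# Placeholder, not the real endpoint: copy the actual base from the
# "Request URL" that the interactive docs generate.
BASE_URL = "https://api.example.invalid/content"


def build_content_url(term, site, **extra):
    """Return a /content request URL for the given search term and site."""
    params = {"term": term, "site": site}
    params.update(extra)  # any other form fields you choose to set
    return f"{BASE_URL}?{urlencode(params)}"


url = build_content_url("election", "gettr")
print(url)
```

Pasting the printed link into a browser is the equivalent of step 5, once the placeholder base is swapped for the real one.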

NOTE: JSON is a data format commonly used on the web. It contains nested “keys and values.” One way to think about it: in a workplace table, you would have a few keys such as “employee name” or “employee position,” each with its own value. These can then be nested inside something larger, like the department or city the employees work in. For our data, the JSON has many different fields containing different aspects of the data, such as the username, the post itself, the time posted, and other details. We recommend using a browser like Firefox because it auto-formats the JSON for you. We present the JSON as close as possible to how it was represented on the native site the data was crawled from.
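To make the workplace analogy concrete, here is a small sketch using Python’s standard json module. The department and employee records are invented for illustration and have nothing to do with the API’s actual data.

```python
import json

# Invented example data: employees nested inside a department.
raw = """
{
  "department": "Research",
  "employees": [
    {"employee name": "Ada", "employee position": "Analyst"},
    {"employee name": "Lin", "employee position": "Engineer"}
  ]
}
"""

data = json.loads(raw)  # parse JSON text into Python dicts and lists
first = data["employees"][0]  # step into the nested list, then the record
print(first["employee name"], "-", first["employee position"])
```

Each key is looked up by name, and nesting is just one lookup after another.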

Now that you have some examples of the format of the data you want to explore, dig through it to find the field (or “key”) you want to search under.

In our case, we are interested in a field under “uinf” called “username” because we are doing an author search. At the moment, Open Measures doesn’t have data field descriptions, so the best way to find the field you need is to look through the JSON results from this /content query.
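If clicking through the JSON by hand gets tedious, a short helper can list every field as a dotted path. This is a minimal sketch run on a made-up sample post; only “uinf”, “username”, and “txt” come from this guide, and the “nickname” field is invented to show nesting.

```python
def field_paths(obj, prefix=""):
    """Yield a dotted path for every leaf field in nested dicts and lists."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from field_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from field_paths(item, prefix)  # list items share a path
    else:
        yield prefix


# Made-up sample shaped like a single post.
sample_post = {
    "txt": "hello world",
    "uinf": {"username": "miles", "nickname": "Miles"},
}
paths = sorted(set(field_paths(sample_post)))
print(paths)
```

The dotted form is convenient because it matches how a nested field such as uinf.username is written in a query.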

json with username field
Prettified JSON blob highlighting the username field.

User Search

Now that we know which field corresponds to the username in the JSON, we are ready to search for the posts written by a specific user on Gettr.

Back in the interactive API docs, we can now construct our input to the “term” field. We combine uinf.username with the specific username we are interested in, in this case “miles”, using the following syntax: “uinf.username : miles”.

NOTE: For those wishing to learn more about the query language behind these requests check out this documentation.
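As a rough sketch of how such terms can be put together in code, the helper below builds clauses in the compact field:value form of Elasticsearch query string syntax and joins them with a boolean operator. The helper name is hypothetical, the second clause on “txt” is only an illustration, and the “es_query” parameter name follows the wording in this post, so check its exact spelling in the docs tool.

```python
def field_query(field, value):
    """Build a field:value clause, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{field}:{value}"


# Author search: every post by one username.
author = field_query("uinf.username", "miles")

# Boolean search: AND, OR, and NOT combine clauses.
combined = f"{author} AND {field_query('txt', 'election')}"

# Assumed parameter names; confirm them against the interactive docs.
params = {"term": combined, "site": "gettr", "es_query": "true"}
print(params["term"])
```

Quoting matters for multi-word values: without the quotes, only the first word would be tied to the field.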

Then we can configure the remaining Open Measures API arguments:

Once your fields look like the following, click “Execute” and copy the URL again. It may take a second to load!

point and click open measures api interface
Open Measures’ interactive point and click API interface.

Once you have the JSON opened in a new tab (here’s a direct link to the query we discussed!), you may have to click to expand some of the fields. Most of what you’re interested in here will be under:

hits > a number > _source.

Once there you will see the contents of the message as the field named “txt” as well as other information.
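The same path can be walked in code. The sketch below runs on a hard-coded sample rather than a live response, and it assumes the list of hits is either directly under “hits” or one level deeper under a second “hits” key, as is common in Elasticsearch-style output. Adjust it to whatever your own response looks like.

```python
def iter_sources(response):
    """Yield each post's _source dict from a parsed response."""
    hits = response.get("hits", [])
    # Elasticsearch-style responses often nest the list one level deeper.
    if isinstance(hits, dict):
        hits = hits.get("hits", [])
    for hit in hits:
        yield hit.get("_source", {})


# Hard-coded sample standing in for a real response.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"txt": "first post", "uinf": {"username": "miles"}}},
            {"_source": {"txt": "second post", "uinf": {"username": "miles"}}},
        ]
    }
}

texts = [source.get("txt", "") for source in iter_sources(sample_response)]
print(texts)
```

Using .get with a default keeps the loop from failing on a post that lacks a “txt” field.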

json with txt field
Prettified JSON blob highlighting the txt (or comment body) field.

Quick Start Guide

Here is a link to a Quick Start Code Guide in Colab or Jupyter notebook format for making requests to our API.

Here are also a few field names to get you up and running. There are many more interesting fields in each dataset, such as likes, external links, even language, and, in some datasets, user-chosen locations.

username code table for open measures api
Table of key field names for various sites.

Conclusion

Once you’ve got the hang of searching for all of an author’s posts, you can experiment with other advanced queries over any of the other fields in any of our data sources, such as language, location, links, etc. As always, let us know on Twitter or elsewhere what you find!

