An API advice guide to making the most of Open Measures’ streaming public data (Telegram, Parler, 8kun, etc.)
Most people know us for Open Measures, our free, open-source online visual tool for data analysis. However, we also have an incredible API, full of stories waiting to be found, for those with more technical needs. The API is ideal not just for researchers and activists but also for journalists, as covered here in our last Substack. Here’s an example Colab/Jupyter notebook to get started. Check out this Open Measures SDK as well:
Hey, folks. If any of you are using/interested in using @SMAT_app, I’ve made a python library/CLI to get to the data easily. You can check it out here: gitlab.com/dhosterman/sma…. Or, you can just pip install smat-cli and go to town.
Daniel Hosterman / smat-cli · GitLab (gitlab.com): Provides command line tools for getting data from the Social Media Analysis Toolkit (SMAT) as well as a library for interacting with SMAT from your own code. (Tweet posted 1:40 PM, Mar 29, 2022.)
Some of the sites you can query with data services from our API are:
- Telegram
- Gab
- Parler
- 8kun
- 4chan
- thedonald.win
- Poal
- Etc.
The API and its documentation can be found directly at api.smat-app.com/docs.
The API exposes the raw JSON behind all of our front-end tools, which is useful for developers and analysts who want to dive deeper into the data or make more fine-grained queries. It has three endpoints, and the Content endpoint accepts an optional boolean logic query. If all this sounds too complex, don’t worry: the interactive documentation makes it easy.
Example Content Endpoint Query
First, click the Content endpoint and then click “Try it out”.
Choose Telegram as the Site, and then click Execute. The page will give you a regular URL that returns the raw JSON for your query. The interface will look like this:
If you curl the URL or open it in a browser, you will get raw JSON that looks like the following (certain browsers, like Firefox, automatically “prettify” the JSON to make it easier to read):
Example of the same request using curl from the command line and jq to pretty print:
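For readers working in Python rather than at the shell, a request like this can be sketched roughly as follows. This is a minimal sketch, not the official client: the `/content` path and the `site`, `term`, and `limit` parameter names are assumptions made for illustration, so check them against the interactive documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: base URL inferred from the docs address (api.smat-app.com/docs).
API_BASE = "https://api.smat-app.com"


def build_content_url(site, term, limit=10, base=API_BASE):
    """Build a Content endpoint URL for one site and one search term."""
    # Assumption: 'site', 'term' and 'limit' are hypothetical parameter names;
    # the interactive docs list the real ones.
    params = {"site": site, "term": term, "limit": limit}
    return f"{base}/content?{urlencode(params)}"


def fetch_content(url, timeout=30):
    """Download the URL and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_content(url) to actually run the query.
    print(build_content_url("telegram", "election"))
```

Keeping URL construction separate from the download makes it easy to paste the generated link into a browser and compare it with the one the docs page produces.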
As you can see, everything in our data is nested under “hits” and “_source”. For most use cases, the key fields for Telegram are:
- channeltitle
- channelusername
- message
- views (note: the view count depends on when the post was crawled)
- date
- postauthor
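The fields listed above can be pulled out of a response with a few lines of Python. This is a sketch under one assumption: that each record sits at `response["hits"]["hits"][i]["_source"]`, the usual Elasticsearch-style layout. The sample record is made up for illustration and is not real Open Measures data.

```python
# Key Telegram fields named in this guide.
TELEGRAM_FIELDS = (
    "channeltitle",
    "channelusername",
    "message",
    "views",
    "date",
    "postauthor",
)


def telegram_rows(response):
    """Flatten a response into one dict per message, keeping only the key fields."""
    # Assumption: records are nested as hits -> hits -> _source.
    hits = response.get("hits", {}).get("hits", [])
    # Missing fields become None instead of raising KeyError.
    return [
        {name: hit.get("_source", {}).get(name) for name in TELEGRAM_FIELDS}
        for hit in hits
    ]


# Made-up sample record, only to show the shape of the output.
sample = {
    "hits": {
        "hits": [
            {
                "_source": {
                    "channeltitle": "Example Channel",
                    "channelusername": "example_channel",
                    "message": "hello",
                    "views": 12,
                    "date": "2022-03-29T13:40:00",
                    "extra_field": "ignored",
                }
            }
        ]
    }
}

rows = telegram_rows(sample)
print(rows[0]["channelusername"], rows[0]["views"])
```

A list of flat dictionaries like `rows` can be handed straight to `csv.DictWriter` or a DataFrame constructor for further analysis.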
Note: Each dataset is structured slightly differently, so you have to analyze the JSON structure of each one individually.
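One quick way to get a feel for an unfamiliar dataset is to count which field names appear across the returned records. The helper below is a hypothetical sketch that again assumes the `hits` -> `hits` -> `_source` nesting; the two sample records and their field names are invented for illustration.

```python
from collections import Counter


def source_field_counts(response):
    """Count how many records contain each top-level field in '_source'."""
    counts = Counter()
    # Assumption: records are nested as hits -> hits -> _source.
    for hit in response.get("hits", {}).get("hits", []):
        counts.update(hit.get("_source", {}).keys())
    return counts


# Made-up records with deliberately different shapes.
sample = {
    "hits": {
        "hits": [
            {"_source": {"message": "first", "date": "2022-01-01"}},
            {"_source": {"message": "second", "views": 3}},
        ]
    }
}

for field, count in sorted(source_field_counts(sample).items()):
    print(f"{field}: present in {count} record(s)")
```

Fields that show up in only some records are the ones to handle defensively when writing a parser for that site.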
Contact
If you need more data or more advanced interfaces for analysis, please reach out to us at info@openmeasures.io.
Identify disinformation and extremism with the Open Measures platform.
Organizations use Open Measures’ tooling every day to track trends related to networks of influence, coordinated harassment campaigns, and state-backed info ops. Click here to book a demo.