That is just the tip of the API iceberg. When you make an API call for the same tweet, you get a look at the “full” schema behind it:
statuses:
- created_at: Fri Jul 05 21:18:55 +0000 2019
  id: 1147253546410098700
  id_str: '1147253546410098688'
  text: 'Leftover mashup from the last two days: Kimchi Meatloaf, Fried chicken tenders, macaroni salad, and biscuit — all h… https://t.co/o2ZE2DJf8F'
  truncated: true
  entities:
    hashtags:
    symbols:
    user_mentions:
    urls:
    - url: https://t.co/o2ZE2DJf8F
      expanded_url: https://twitter.com/i/web/status/1147253546410098688
      display_url: twitter.com/i/web/status/1…
      indices:
      - 117
      - 140
  metadata:
    iso_language_code: en
    result_type: recent
  source: <a href="https://mobile.twitter.com" rel="nofollow">Twitter Web App</a>
  in_reply_to_status_id:
  in_reply_to_status_id_str:
  in_reply_to_user_id:
  in_reply_to_user_id_str:
  in_reply_to_screen_name:
  user:
    id: 5954192
    id_str: '5954192'
    name: Kin Lane
    screen_name: kinlane
    location: Seattle, WA
    description: I am a writer. I like telling stories. I like playing with images. I also do things with the Internet as the API architect at F5 Networks.
    url: https://t.co/8qR9jIDiSG
    entities:
      url:
        urls:
        - url: https://t.co/8qR9jIDiSG
          expanded_url: http://kinlane.com
          display_url: kinlane.com
          indices:
          - 0
          - 23
      description:
        urls:
    protected: false
    followers_count: 14354
    friends_count: 8721
    listed_count: 677
    created_at: Fri May 11 08:17:59 +0000 2007
    favourites_count: 11289
    utc_offset:
    time_zone:
    geo_enabled: true
    verified: false
    statuses_count: 4190
    lang:
    contributors_enabled: false
    is_translator: false
    is_translation_enabled: false
    profile_background_color: 1A1B1F
    profile_background_image_url: http://abs.twimg.com/images/themes/theme9/bg.gif
    profile_background_image_url_https: https://abs.twimg.com/images/themes/theme9/bg.gif
    profile_background_tile: false
    profile_image_url: http://pbs.twimg.com/profile_images/1139394614643658754/X-PvqUTV_normal.png
    profile_image_url_https: https://pbs.twimg.com/profile_images/1139394614643658754/X-PvqUTV_normal.png
    profile_banner_url: https://pbs.twimg.com/profile_banners/5954192/1560487874
    profile_link_color: ABB8C2
    profile_sidebar_border_color: FFFFFF
    profile_sidebar_fill_color: '252429'
    profile_text_color: '666666'
    profile_use_background_image: true
    has_extended_profile: true
    default_profile: false
    default_profile_image: false
    following:
    follow_request_sent:
    notifications:
    translator_type: none
  geo:
  coordinates:
  place:
  contributors:
  is_quote_status: false
  retweet_count: 0
  favorite_count: 5
  favorited: false
  retweeted: false
  possibly_sensitive: false
  lang: en
search_metadata:
  completed_in: 0.048
  max_id: 1147439612383678500
  max_id_str: '1147439612383678465'
  next_results: "?max_id=1144723262863269887&q=kinlane&count=100&include_entities=1"
  query: kinlane
  refresh_url: "?since_id=1147439612383678465&q=kinlane&include_entities=1"
  count: 100
  since_id: 0
  since_id_str: '0'
I do not have my geographic settings turned on, so I’m not recording those data points. However, there is still a lot of schema behind the scenes of this single tweet. It contains relevant data points about my world, providing context for the tweet but also widening the behavioral snapshot of my personal world. Anyone can pull this data using the Twitter API. All you need is an API key and the id of a user, and you can obtain this “full” view of the schema that exists behind each and every tweet. There is a lot of data the average user doesn’t see, and isn’t burdened with. This layer of the web is primarily the domain of developers, but that is just one of several dimensions. There is also the provider view, or in this case, what Twitter sees.
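To make that concrete, here is a minimal Python sketch of how a developer might ask for this data and then list every field that comes back. It rests on assumptions: the endpoint URL and bearer-token header are my guess at the search call behind the response above (its search_metadata shows q=kinlane, count=100, and include_entities=1), the sketch searches by query rather than by user id, and build_search_request and field_paths are hypothetical helper names. Access rules for the Twitter API have also changed over time, so the request is only built here, never sent.

```python
import urllib.parse
import urllib.request

# Assumption: the v1.1 search endpoint, inferred from the query strings in the
# response above. The article itself does not name the endpoint.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_request(query, bearer_token, count=100):
    """Build (but do not send) a search request for tweets matching `query`."""
    params = urllib.parse.urlencode(
        {"q": query, "count": count, "include_entities": 1}
    )
    request = urllib.request.Request(f"{SEARCH_URL}?{params}")
    # Assumption: app-only bearer-token auth; substitute your own credentials.
    request.add_header("Authorization", f"Bearer {bearer_token}")
    return request


def field_paths(node, prefix=""):
    """Yield a dotted path for every leaf field in a parsed JSON response."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from field_paths(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        if not node:
            # Keep empty lists visible as fields in their own right.
            yield prefix
        for item in node:
            yield from field_paths(item, prefix + "[]")
    else:
        yield prefix


# A small hand-copied excerpt of the response shown above, for illustration.
sample = {
    "statuses": [
        {
            "id_str": "1147253546410098688",
            "truncated": True,
            "entities": {"urls": [{"indices": [117, 140]}]},
            "user": {"screen_name": "kinlane", "followers_count": 14354},
        }
    ],
    "search_metadata": {"query": "kinlane", "count": 100},
}

# De-duplicate so repeated list items collapse into one schema path.
paths = sorted(set(field_paths(sample)))
for path in paths:
    print(path)
```

Run against a full parsed response instead of the excerpt, the same field_paths call lists every property Twitter returned for a tweet, which is a quick way to see how much schema sits behind a single post.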
The schema I posted is just one portion of the schema you see behind each tweet. Twitter possesses even more data than this, with logging, cookie, and other dimensions that take things much further than what is listed here. What I have shown is what we see going over the API pipes between a desktop, web, or mobile application and the Twitter mothership, but there is a whole other set of data behind Twitter’s API servers. I consider there to be four layers of schema for defining and providing access to our regular social behavioral patterns.
These four layers of schema are made available to four very different actors. Depending on who you are, you have a different view of the schema. There are also two other key variables that come into play when discussing who can see different parts of the schema, and who is entitled to even know about the data stored within the platform schema, let alone have any decision-making capability around it.
The picture I’m painting here isn’t about data ownership or specifically data privacy. I’d say it is more about data awareness and data voice: having a say in what happens to your data. Most platform, API, and application providers feel free to generate, store, process, and own data about you, generated on your devices, within your home, just because it was facilitated with their software. I am trying to connect all the dots when it comes to the data we generate each day, while also working to shine a light on the data we do not know about. I know there are a number of folks who feel like this data is free for the taking, simply because users do not know it exists. In their view, it’s not private data if you don’t know that you have it.
It is this technological sleight of hand that bothers me. It is a form of exploitation via complexity and confusion. If you make something complicated enough on the web, and hide it right beneath the surface where few people will see it, you can not only get away with doing things inside someone’s personal space, but you can also surveil, track, and generate data which you can sell or use as a raw material in the development of new digital products. It is a particularly invasive form of intrusion that is only being sanctioned because very few people can even see it. The layers of the schema are obfuscated differently, depending on where you are in the overall supply chain.
I have separate but overlapping concerns when it comes to each of the layers of schema in place across major tech providers. I have concerns about what is disclosed to users, as well as what is openly made available to 3rd party developers. But I have the most concern about the portions of the schema that never see the light of day. The portions that we end users have no idea exist, even though it is all data about us. The bits of our digital self that tech companies view as commodities, actively use in products, and sometimes make accessible to partners, but refuse to ever tell us about, let alone give us a voice over what gets collected and who has access to it. This is the schema that keeps me up at night. I feel like 95% of it will be harmless, and act more as an annoyance than anything particularly troubling. However, it is the 5% of the schema that I can’t see, that I can’t correct, or that I do not have any voice over that could end up impacting my credit and my career, and have real-world consequences in my physical life.