Gamestop, European Superleague, and Archimedes.

Brian G Herbert
28 min read · Nov 1, 2021


Real-World Events and Virtual Levers: the app I built to take a closer look

3-d plot of Superleague Tweets: 90+ percentile on Influence, color-coded for Sentiment

A few months ago I was brainstorming for my next Python analytics app. It needed to involve Tweets, the Twitter API, and nlp (Natural Language Processing), in particular sentiment analysis of social media posts. I received a Twitter Developer License in response to my application and couldn’t wait to spend hours debugging JSON parsing!

I got hooked on Python when I went back to school three years ago for data analytics and machine learning certifications at Emory University. My first big Python app came when Covid shut things down in the spring of 2020: I built an app to pull state and local medical data, CDC and Census stats, and plot trends and heatmaps. Now I’m hooked and I need to build something every week!

Without deadlines and scope from a project sponsor, I was on the hook to manage my scope and, even worse, manage myself! Things were touch and go for a while, and at my last performance review I threatened to fire myself. But I worked it out with me, and it’s all good now. Me and I got to brainstorming a story for a social media analytics app…I needed a story with fuzzy lines between public events and social media…politics make me puke, so nothing in that domain…it had to be a big news event where posts were central to the story…

Wordcloud I generated from the Gamestop dataset I built using the Twitter API

And two answers hit me! I’d followed both trading in shares of Gamestop and the failed launch of the European Super League (soccer to us over here!), and I’d followed threads of Tweets on both.

One story had an online group of individuals (“dumb money” in Wall Street parlance, though anything but, judging by how it turned out) creating a ripple in the market that became a tsunami.

The other story included executives at sporting clubs with global reach horribly misjudging the reaction to their plans, leading to the collapse in three days of their proposed league.

I set out to build a topical dataset of tweets for each story (which meant learning the Twitter API and data schema and making API requests to build out topical datasets), building metrics to create measures of tweet ‘Influence’, and scoring tweets for sentiment. Finally, I wanted to bring these things together and plot influence and sentiment in the context of each of the two stories, so I also needed to pull market data and story timelines to plot the real world against the Twitter world.

I didn’t expect to find a “smoking gun”. As a Proof of Concept, this project was about coding the end-to-end pieces, so I won’t have great predictive insight until my next article :-). I just wanted to grab good data and see what I could shake out of it.

The Twitter API, Data Schema, and Parsing JSON…

GME Closing Price and Daily Volume, Friday Jan. 8 through Friday Mar. 26, 2021

My first step was to apply for a Twitter Developer license, get smart about the Twitter API endpoints and data schema, and build a dataset of tweets that were on-topic. I also defined a basic socialization measure in my app called Influence: the aggregate of the intrinsic Twitter metrics of Retweet count, Quoted Tweet count, Reply count, and Favorited count. I did not want duplicate Tweets in my dataset, which could skew my measures, so I built custom filtering and scrubbing functions. For example, Retweets, unlike Quoted Tweets, add no new content to a ‘corpus’ (the accumulated text from tweets in my dataset). I wanted Retweet metrics, but not duplicate copies of text content. I built some output into my parser, filter, and search functions to get an idea of how the raw dataset was shaping up. More on that later.
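Here’s a minimal sketch of the kind of de-duplication filter I’m describing, assuming each tweet has already been parsed into a dict with the Twitter v2 field names (‘id’, ‘referenced_tweets’); the helper name is mine, not from the app as posted.

```python
# Minimal sketch of the de-duplication idea: keep retweet metrics upstream,
# but don't let retweeted copies of the same text inflate the corpus.
# Assumes each tweet is a dict parsed from a Twitter API v2 JSON response.

def dedupe_tweets(tweets: list[dict]) -> list[dict]:
    """Drop exact duplicates by tweet ID and skip plain Retweets,
    whose text merely repeats the originating tweet."""
    seen_ids = set()
    kept = []
    for tw in tweets:
        if tw["id"] in seen_ids:
            continue
        seen_ids.add(tw["id"])
        refs = tw.get("referenced_tweets", [])
        # Plain RTs add no new text; Quoted Tweets and Replies do.
        if any(ref.get("type") == "retweeted" for ref in refs):
            continue
        kept.append(tw)
    return kept
```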

Program Output for data importing and de-duping

At first I planned to code the API interface in Python, then I discovered a tool called Postman. Postman is a highly configurable, ‘middleware-like’ tool which provides an interface to public APIs. It handles passing OAuth2 authentication keys, secrets, or bearer tokens, and allowed me to pass query strings, the Twitter fields and objects I wanted in the response, or specific Tweet IDs as variables to the API request formatter in Postman. Postman has several options for interfacing with user files and scripts, from a scripting piece that can be run prior to any API call to a CLI add-on called Newman which can move data between another program or app and Postman.

Please, Please Mr. Postman…

Postman ‘environments’ allow variables to be passed to the Twitter API query endpoints, responses in JSON

Twitter has a download for Postman which preconfigures queries to each endpoint, called a Postman ‘collection’. I then created a Postman ‘environment’ for each of my target datasets. A Postman environment contains variables to pass to the API request, such as values for selecting data: query strings for text or hashtag searches, Tweet IDs or User IDs, User Names, and start and end dates. I also added fields and objects from the Twitter data schema that I wanted in the response. Postman returns these responses as JSON files. Using Postman streamlined my interface coding; all I needed to do was write a big, hairy JSON parser to deal with inconsistencies in the data returned from different Twitter endpoints.

View of Twitter API v2 ‘collection’ in Postman. My app provides a list of missing tweets, prioritized by influence metrics, to GET tweets by ID

There are over 20 query endpoints for the Twitter Developer API. I ended up figuring out a process for building out my topic datasets with 5 API calls: the ‘full archive’ search for building the raw set of tweets if they’re more than 30 days old; ‘GET tweets by Tweet IDs’, which I used to find missing tweets from QT, RT, and Reply info; ‘GET users by usernames’ to pull account info on frequently referenced users; and ‘GET user timeline tweets’ and ‘GET user mention tweets’ to get tweets associated with the most important users on a thread or topic.

In the two stories, there were some users who both contributed very influential Tweets and were extensively referenced in the tweets of others. They included Fabrizio Romano and UEFA’s Aleksander Ceferin for Superleague, and users such as Chamath Palihapitiya, “The Roaring Kitty”, and even Elon Musk for Gamestop. To ‘fill out’ a raw dataset, analysis of hashtags and user_mentions identifies the most influential users in the threads. With my list of important or influential users, I used the ‘GET user timeline tweets’ and ‘GET user mentions tweets’ API calls to get batches of tweets associated with them. In this case, it would have been easier to make the calls to the Twitter API directly from Python rather than go through Postman. However, the time I saved not writing code for an API interface myself I was able to spend on other aspects of the app.

Also, Postman has great support for passing data via environment variables and other methods, so I never second-guessed that design decision.

parser excerpt: Twitter APIs have inconsistent date formats, object names, and placement of fields to handle…
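As an illustration of what that parser has to cope with, a small normalizer along these lines handles the two created_at styles I ran into; the exact format strings are assumptions based on the v1.1 and v2 documentation, not an excerpt from my code.

```python
from datetime import datetime, timezone

# Hedged sketch: normalize the two created_at styles returned by the APIs.
# legacy v1.1 style: "Wed Jan 13 20:30:00 +0000 2021"
# v2 style:          "2021-01-13T20:30:00.000Z" (ISO-8601)
def parse_created_at(raw: str) -> datetime:
    for fmt in ("%a %b %d %H:%M:%S %z %Y",    # legacy v1.1 format
                "%Y-%m-%dT%H:%M:%S.%fZ",       # v2 ISO-8601 with milliseconds
                "%Y-%m-%dT%H:%M:%SZ"):         # v2 without milliseconds
        try:
            dt = datetime.strptime(raw, fmt)
            return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized created_at format: {raw}")
```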

Influence: an aggregate measure of Retweets, Quoted Tweets, Replies, and Favorited (‘Likes’) Counts for a particular originating Tweet

Another method I used to fill out the raw dataset was to search through each Retweet and Quoted Tweet in my raw dataset. With each, I would parse the ‘originating’ tweet ID from its metadata, as well as the metrics on that originating tweet, which it also keeps as metadata. This is an example where the robust, layered structure of JSON records from the Twitter API allows for powerful analysis. Replies also contain information on the tweet or user to which they refer, but not as an object in the JSON response with metadata for the reference, like we get for RTs and QTs.

My app first searches the raw dataset to see if it has the originating tweet. If it doesn’t, it creates a record for each missing tweet with its metrics, sorts the list by descending ‘influence’ (sum of Q-R-R and Fave counts), and writes the list out to be used by Postman.

What is cool about this is that the layers of objects and metadata in the Tweet record allow me to identify missing data by ID AND get the metrics to understand how important it is!
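A sketch of that mining step, assuming my parser has already merged the expanded referenced tweets (from the ‘includes’ payload) into each record so the referenced entries carry their own public_metrics; field names follow the v2 schema, while the function and dict layout are illustrative.

```python
# Sketch: mine RT/QT metadata for originating tweets missing from the dataset,
# and rank them by the influence metrics carried in the referencing records.

def find_missing_originators(tweets: dict[str, dict]) -> list[dict]:
    """Return originating tweets referenced by RTs/QTs but absent from the
    dataset, prioritized by aggregate influence (QRR + Fave)."""
    missing = {}
    for tw in tweets.values():
        for ref in tw.get("referenced_tweets", []):
            if ref["type"] in ("retweeted", "quoted") and ref["id"] not in tweets:
                pm = ref.get("public_metrics", {})
                qrr = (pm.get("retweet_count", 0) + pm.get("quote_count", 0)
                       + pm.get("reply_count", 0))
                missing[ref["id"]] = {"id": ref["id"], "qrr": qrr,
                                      "fave": pm.get("like_count", 0)}
    # most influential missing tweets first, ready to hand off to Postman
    return sorted(missing.values(), key=lambda m: m["qrr"] + m["fave"], reverse=True)
```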

Supplementing dataset with search on top user_mentions

I mentioned the analysis of user_mentions and hashtags, and explained how I use the user data. As far as hashtags, I use those more ad hoc, to add some of the hashtags to query strings for API calls. For example, adding #wsb (wallstreetbets) to (#GME or #Gamestop) to see if I get new threads to add to the topic dataset. By the way, query strings in an HTTP API must be URI-encoded, so ‘#GME’ must be converted to ‘%23GME’, and whitespace must be ‘%20’, prior to sending.
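In Python that encoding is a one-liner with the standard library, for example:

```python
from urllib.parse import quote

# '#' and spaces must be percent-encoded before the query string is sent
query = "(#GME OR #Gamestop) #wsb"
print(quote(query, safe=""))   # %28%23GME%20OR%20%23Gamestop%29%20%23wsb
```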

Twitter continues to advance the endpoints available as well as the layers of objects in the Tweet data schema. Twitter solicited detailed feedback from me on their API roadmap and some ideas they’re testing out, which I really appreciated, and they shared some planned API features. It is funny how this app, which came from the simplest of roots with its 140-character status updates, has become a tool with such robust metadata and an API with over two dozen specialized endpoints.

Example of Context Annotation for tweet record in JSON response from API

Twitter’s own bots parse through tweets and identify references to people, places, products, and topics. These are known as context annotations. I’ve included an example at left from an actual JSON response from the API. After pulling this entity ID for Super League from the context_annotations object in the response, I can use it on other queries or to match/filter records in my datasets. There is also a ‘filtered stream’ endpoint in API v2, which includes an API to create rules for stream capture. I had some challenges putting together topical datasets over 30 days old and dealing with monthly record quotas, but I got more efficient with data acquisition as the project moved forward by leveraging the extensive metadata in the Twitter data schema.
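A hedged sketch of that matching step; the entity ID below is a placeholder, not the real Super League ID from my response.

```python
# Keep only tweets whose context_annotations reference a given entity ID.
# The ID would come from an earlier API response; this constant is a placeholder.
SUPERLEAGUE_ENTITY_ID = "PLACEHOLDER_ENTITY_ID"

def has_entity(tweet: dict, entity_id: str) -> bool:
    return any(ann.get("entity", {}).get("id") == entity_id
               for ann in tweet.get("context_annotations", []))

# usage: on_topic = [tw for tw in parsed_tweets if has_entity(tw, SUPERLEAGUE_ENTITY_ID)]
```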

Top-16 hashtags and Top-8 user_mentions from Gamestop tweet dataset

After filtering out stuff like #GME or #Gamestop, the dataset had #amc, #wallstreetbets, and #wsb as the most frequent hashtags, and @wallstreetbets and @wsbmod (the forum’s moderator) were in the top ten for user_mentions. @chamath is Chamath Palihapitiya, billionaire tech investor. Just off the chart, but a key figure in the r/wallstreetbets threads on Gamestop, was a cool cat with the handle @TheRoaringKitty. @elonmusk (needs no intro) and @jimcramer, host of CNBC’s Mad Money, may not have originated lots of Twitter content but were referenced extensively. Any worthy story needs characters like this!

Output from functions which filter dataset then mine metadata for missing ID’s and metrics

To wrap up the data acquisition piece, my big takeaway is to request all fields (individual elements like ‘date created’, ‘id’, or ‘text’ for a tweet) and objects (compound json elements in the response like a geo-place object or user object) that can be leveraged to learn more about the topic or thread space. If I had these extended fields and objects, I had the references to other tweets and users which I could mine to build additional API calls.
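For reference, the request parameters look roughly like this (these are real Twitter API v2 parameter names, but the exact query and field list are illustrative, not my Postman environment verbatim):

```python
# Field/expansion parameters worth requesting on every search call;
# more metadata in the response means more references to mine later.
SEARCH_PARAMS = {
    "query": "(#GME OR #Gamestop) lang:en -is:retweet",   # example query only
    "tweet.fields": "created_at,public_metrics,referenced_tweets,"
                    "context_annotations,entities,author_id",
    "user.fields": "username,public_metrics,verified",
    "expansions": "referenced_tweets.id,author_id,entities.mentions.username",
    "max_results": 100,
}
```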

OK, enough on the data acquisition, let’s talk about Gamestop…

The Gamestop Story

The Gamestop stock rally seemed to get traction starting the week of Monday, January 11 via threads in r/wallstreetbets, a reddit online forum for talk about investments and trading. The chatter about Gamestop (NYSE: GME) also spread to Twitter, where noteworthy posts by well-known personalities likely helped the rally (see inset). My interest in including this story as a test bed for Twitter analytics was that the rally seemed to materialize ‘out of the blue’ from retail investors whose connection to each other was virtual. Both the Gamestop rally (or maybe ‘squeeze’ is a better term, more on that below) and the unravelling of Superleague built grassroots momentum via social media.

News Media, Congress and Regulatory Agencies involved in financial markets or consumer protection have allowed special interest groups to wind them up on this issue
Media, Congress, and Regulators Parroted a View Promoted by Those who Shorted the Stock and Lost!

Prior to Jan 11, Gamestop common shares had spent a couple of years at or below $10/share (Gamestonk, by the way, is a mashup of Gamestop and ‘stonk’, a common misspelling of stock). The firm had dominated the gaming space when customers shopped in bricks-and-mortar stores, but there were questions about its ability to transition to a virtual, i.e. e-commerce, world. Through January and February there was crazy volatility with shares of GME, but since the 3rd week of March, the stock has not dropped below $140/share. That gave me boundaries of Jan 11 and Mar 26 within which I’d build a topical dataset of Tweets.

Plotting Share Open, High, Low, Close and Volume with metadata from my raw dataset

I knew this was a good story for Twitter traffic, but it also interested me from a business standpoint. Had something fundamentally changed? Under the efficient market theory, we’re taught to doubt that there is latent, hidden value known only to a few. Markets efficiently incorporate all information, just as water seeks its level. Yet for 6 months now the firm’s common shares seem settled at a new, much higher, price. This drove me to look through GME’s 10-K filings to understand their strategy and financial position better, both then and now.

Once a social ‘hub’ for gamers, what’s the future for 4800 stores?

Gamestop became the dominant brand in the retail video game space in a period when the gamer market made most purchases in stores, after bumming a ride from Mom to the mall. Can the firm leverage its assets, expertise, and relationships to dominate the gaming market now that most purchases are made online (via ‘e-commerce’)? Their 4,816 worldwide stores at the start of fiscal 2021 must either contribute to that goal or contribute cash when sold. Covid hit retail hard, and GME was no exception with a 21% drop in sales for fiscal 2020 year-end. But markets do efficiently incorporate all information, and Covid hit all retail hard; with that mainly in the rearview, it’s probably not a huge deal for GME. In studying the 10-K reports, though, the declining period-to-period, same-store sales comparisons looked like more than the Covid slump. It’s the sound of a market moving to virtual space.

From Gamestop 10-K filings, note: change in cash and debt positions since rally began

At 2020 fiscal year-end (Jan. 31), GME had shuttered 12.5% of their stores from one year prior. At the end of Q3 2020, they reported an increase in e-commerce sales of 430% year-to-date over 2019. Both are positive signs of their strategic transition, but enough with the business report! The inset has my summary of key figures from the last 4 quarterly reports…

Meme Stocks and Covid Stimulus…

When you think of smart hedge fund types vs. Barcalounging day-traders, you don’t usually think of the latter party coming out of the encounter in the money. Early coverage in the news media spoke of “irrational exuberance” or “gambling with the family covid stimulus check”. Discount brokerages were blamed for using memes, virtual party streamers, and other cheap tricks to incent inexperienced traders to churn stocks. The practice of “payment for order flow”, in which market makers pay brokerages to route trades to them, was said to pressure brokerages to squeeze more money out of retail customers. Did it push Robinhood to take advantage of the guppies in the Wall Street food chain? Or was ‘payment for order flow’ just an ordinary cost of doing business for online, discount brokerages, a business which has made investing accessible and affordable to millions? Was the sensationalist early news coverage all so much irrelevant noise compared to what was really happening?

$GME price, volume(size) and gain(color) vs. High-QRRF GME Tweets with Sentiment (color)

Yes, most of it was B.S. I started hearing more plausible explanations for the rally, even a new twist on the “short squeeze”, a strategy that was coined j.i.t. as “the gamma squeeze”. You can google ‘gamma squeeze’, or you can read this excellent article by George Calhoun on Forbes. In “Gamestop/Gamestonk has nothing to do with the madness of crowds”, Calhoun explains the mechanics of short selling, options, and the squeeze as they relate to GME.

From everything I’ve taken in on the story, here’s my summary of the GME rally from January to March:

  1. Not all market transactions are voluntary: there is forced selling…and forced buying. True leverage is knowing when the other fool is in a forced position.
  2. Buying calls (the option to buy shares at a specified price on or before a specified date) provides an accelerant to spike a rally. The pressure to ‘cover’ positions by buying shares is known as exposure. This exposure applies to sellers of call options and short sellers of a stock. If exposure exceeds the number of shares available on the market, it’s simple: it looks like parents going after Tickle Me Elmos before Christmas.
  3. The Internet/social media can now align huge numbers of small traders, with aggregate buying power that exceeds the ‘collaborations’ (or ‘collusions’) we saw in past decades. The aggregate depth of tens of thousands of shallow pockets can exceed a handful of deep pockets, and even insiders may not see it coming. The virtual world transformed where Gamestop’s customers shop for products, and it changed how its investors gained critical leverage.

I still had the question of whether things had changed for Gamestop, ‘the company’. The share volatility dropped off after late March and stabilized at a much higher price. I went back to the firm’s 1st and 2nd Quarter 2021 10-Q filings and found some items that I haven’t seen in the news coverage. The first is from the fiscal 2020 year-end (late January 2021) 10-K:

“In December 2020, we established an “at-the-market” offering program (the “ATM Program”) that provides for the sale of shares of our Class A Common Stock having an aggregate offering price of up to $100 million. To date, we have not sold any shares of our Class A Common Stock under the ATM Program.”

Here’s how Gamestop’s ATM worked (man, I want an ATM like this…): in the first quarter of 2021 the firm sold 3.5 million shares of common stock through the program, raising $551.7 million in proceeds. In the second quarter they sold 5 million shares for $1.126 billion in proceeds. From my summary of the key numbers, you can see how a $1.678 billion windfall helps the balance sheet.

I thought of the ‘self-fulfilling prophecy’. A few Tweets and chats in r/wallstreetbets project confidence in Gamestop’s future, then a few more…through some new-age market dynamics the stock goes wild…creating an opportunity for the firm to convert its rise in market value into inexpensive cash…allowing them to pay down debt and increase cash on hand…putting them in a strong position to execute their e-commerce transformation. Voila, those “exuberant predictions” were really clairvoyance!!

On Monday, October 18th the SEC released its report on Gamestop, ‘payment for order flow’, and the evolving relationship between online brokerages and retail customers. Their assessment was basically that there was no evidence of conspiracy or fraud, the system worked as designed, and no structural changes are necessary at this time. But they plan to keep an eye on the situation…

OK, back to the app…

Gamestop Tweets at or above 90th percentile for ‘influence’ metrics, but…my 11k sampling is small

Tweet Influence

To build basic measures of social influence into my app I leveraged metrics which are intrinsic to the Twitter model. There is an optional object that can be requested in the JSON response called ‘public_metrics’. Three of its metrics are counts of how many times each tweet has been Retweeted, Quote-Tweeted, or Replied to. As a measure of the recirculation of content or opinion, I called this QRR (for Quoted-Retweeted-Replied, catchy, huh? I used to work in product marketing :-) ). The other basic measure that I integrated into my app is the count of times a tweet has been ‘Liked’ or ‘Favorited’ (these are the same; Twitter still has inconsistencies with some field and object naming). I abbreviate it and call it ‘Fave’ in my app.

Statistical Distribution of Influence Metrics for Gamestop and SuperLeague Datasets

Between Q-R-R and Fave I felt I had two good proxy measures for Influence. QTs, RTs, and Replies all relate to sharing and/or adding to a thread of content: the higher the Q-R-R, the more socialization and thus the more impressions, meaning Influence. The Fave measure is the best metric intrinsic to Twitter for content affinity, which is another aspect of Influence. I simply leveraged Q-R-R as a proxy for social spread or impressions, and Fave as a proxy for affinity, and together they would be my best proxy for Influence.
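Computing the two proxies from a parsed tweet is straightforward; this is a sketch using the v2 public_metrics field names, with my QRR/Fave naming:

```python
# QRR (quote + retweet + reply counts) and Fave (likes) pulled from the
# v2 "public_metrics" object; together they are the proxy for Influence.
def influence(tweet: dict) -> tuple[int, int]:
    pm = tweet.get("public_metrics", {})
    qrr = (pm.get("quote_count", 0) + pm.get("retweet_count", 0)
           + pm.get("reply_count", 0))
    fave = pm.get("like_count", 0)
    return qrr, fave
```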

Similar Dimensionality with Q-R-R and Faves, both Originating Tweet and Retweet

Dimensionally, both QRR and Fave are integer values, both skewed by outliers, or a long ‘tail’ of very high counts for a small number of tweets. The majority of tweets are below 5 on both measures. The Twitter data model also includes these metrics in Retweet and Quoted-Tweet records, for both the current tweet AND the tweet from which it derived, what I call the ‘originating’ tweet. After analyzing my initial, raw datasets I realized this metadata gave me both an ID for each missing record and a measure of its importance, which was vital for building out my topical datasets from an initial raw, incomplete set. I built Python functions to iterate through the dataset, identify threads with retweets and/or quote tweets, look up the originating tweet ID to see if I had it in the dataset, and, if it was missing, prioritize a missing-tweet list based on the aggregate influence metrics for each.

Scatter of Tweets that were 90th percentile and above on Influence

I played with combinations of filters for influence and sentiment thresholds on which to match. In my standard run of the app, I take only Tweets at the 90th percentile and up on Q-R-R or Fave metrics. I had four score types that the Vader sentiment engine calculated: compound, neutral, positive, and negative. I experimented with different matching rules and also looked at the distribution of both Influence metrics and Sentiment scores (see insets) to come up with interesting intersections of the distributions.
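The percentile filter itself is simple; here is a sketch with pandas, assuming a DataFrame with ‘qrr’ and ‘fave’ columns built earlier (the column names are my assumptions):

```python
import pandas as pd

# Keep tweets at or above the 90th percentile on either influence measure.
def top_decile(df: pd.DataFrame) -> pd.DataFrame:
    q90_qrr, q90_fave = df["qrr"].quantile(0.9), df["fave"].quantile(0.9)
    return df[(df["qrr"] >= q90_qrr) | (df["fave"] >= q90_fave)]
```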

Log Scale Distribution of Vader Compound, Neutral, Positive and Negative Scores

Since I was building both datasets a few months after the fact, I had to use the full-archive API endpoint for the bulk of my records. The problem with that is I have a quota (I have a developer’s license, but on a free tier and not under academic research) which limits me to 5,000 records a month through that API. I’m just being open that my datasets each ended up with just over 10,000 topic-specific tweets. Not bad for my purposes, but it limited me a bit with the selection criteria I could apply and still get a result set that wasn’t too ‘sparse’! As I got deeper into the data and API, I got a lot smarter about efficiently getting target records from the API. The process I described, using metrics in the records I had to identify referenced tweets as well as the influence metrics for those referenced tweets, is like a precision sniper shot, while using hashtag queries on the full-archive API is more like a shotgun blast.

Overlay: stock prices sized for trading volume, scatter marker sized for influence colored for sentiment

Sentiment

Distribution of Compound, Negative, Neutral, and Positive Sentiment Scores, 10k Gamestop Tweets

Positive sentiment could come from expressing certainty that a firm will tank and its shares will go south, and high negative sentiment could come from expressing that anyone who hasn’t bought the stock is a moron. Positive and Negative scoring have to do with phrasing and aren’t necessarily tied to approval or disapproval of the thread’s main topic.

The tool I integrated in this app, Vader, distributes scoring, from 0 to 1, among neutral, positive, and negative, and then also provides a compound, or overall, score on a -1.0 to +1.0 scale. Vader was designed with a special lexicon to handle social media’s short written posts and their unique set of colloquial symbols, punctuation styles, use of case, acronyms, and initials to express emotion or intent. Through Vader’s rules engine, it can adjust sentiment scoring for things like word-order sensitivity and multi-word phrases, degree modifiers, punctuation amplifiers, and polarity switches. Sentiment analysis often has three buckets: neutral, positive, and negative, and can be further divided by objective or subjective statements. What the distribution of score types can tell us, without more granularity, is a general measure of the polarity of threads on a topic. I ran statistical distributions from my Gamestop and Superleague datasets and found that Superleague tweets showed more polarity and a greater standard deviation across multiple sentiment types.

ESL vs GME Tweets: ESL higher negative scores plus greater range and variance both neg and compound

The widespread use of sarcasm in Tweets and all kinds of punctuation, expressions, and multimedia memes, makes measurement a challenge. I chose the nltk-Vader package for its strength with social media posts — and the online ‘shorthand’ which grew organically from Twitter. Vader is based on a lexicon and rules, and can also be extended by the user to adjust its scoring based on the context. I have done this by extending the ‘idioms’ it recognizes and their associated scoring adjustment.
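A minimal example of that kind of extension, using nltk’s bundled Vader; the slang terms and valence weights here are illustrative, not my actual adjustments (and my idiom changes required edits beyond a simple lexicon update):

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-time lexicon download
sia = SentimentIntensityAnalyzer()

# Illustrative only: add trader slang with hand-picked valence weights.
sia.lexicon.update({"stonks": 2.0, "tendies": 1.5, "bagholder": -1.5})
print(sia.polarity_scores("GME stonks to the moon!!"))
```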

Measuring sentiment in an article, a video, a presentation, a dialogue, or a Tweet poses very distinct challenges. I built a small nlp app about a year ago which pulled all the lyrics for a given band through the lyricgenius api, tokenized words and song lines, did tf/idf analysis, and generated wordclouds. I also fed the ‘corpora’ of lyrics to parts-of-speech analyzers and the Word2Vec engine, to look at similarities and differences with words and phrases.

Burst of high-influence GME tweets during end of Jan market spike

I mention my lyrics project to note how much written language content differs depending on the context. Song lyrics had certain language challenges, which varied by genre, but it was not nearly as difficult as translating the content from Twitter! Lyrics often break rules of grammar, but they don’t have their own symbolic language, or alternate use and meaning of punctuation.

Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.

A note on managing a corpus as it relates to sentiment scoring. Early on I integrated or stubbed out some nlp tasks for this app (outside the scope of what I’m writing about today), and I was running cleaning steps like STOP word removal, lower-case conversion, spelling standardization, and removal of punctuation and non-alpha characters. Since Vader picks up on many sentiment-laden constructs found in tweets, such as repeated punctuation, sequences of punctuation, all caps or UCS-2 symbols, emoticons, acronyms and initials, and slang, any cleaning I did was lowering the performance of the sentiment analyzer! This is maybe an obvious point for some, but I found it best to run sentiment analysis on something close to the raw dataset, with minimal word or punctuation removal. If running other nlp tasks that require scrubbed and cleaned text, run a copy of the content on a separate course from the one used for sentiment scoring! I used a copy of the dataset to clean for other nlp tasks.
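In code, that ‘separate course’ amounts to something like this sketch (the cleaning regex here stands in for my actual scrubbing functions):

```python
import re

def split_courses(tweets: list[dict]) -> tuple[list[str], list[str]]:
    """Return (sentiment_corpus, nlp_corpus): raw text for Vader,
    scrubbed text for tokenizing / tf-idf / wordclouds."""
    sentiment_corpus = [tw["text"] for tw in tweets]                 # untouched
    nlp_corpus = [re.sub(r"[^a-z0-9\s]", " ", tw["text"].lower())    # strip punctuation
                  for tw in tweets]
    return sentiment_corpus, nlp_corpus
```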

section of code that applies Vader sentiment scoring, and then selects Tweets with highest Influence scores

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. Vader is available from the Python package library, or as source code on Github.
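A sketch of the scoring-plus-ranking step captured in the code excerpt above, assuming a DataFrame with ‘text’, ‘qrr’, and ‘fave’ columns; the column names and the top_n default are my assumptions, not the app’s actual schema:

```python
import nltk
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

def score_and_rank(df: pd.DataFrame, top_n: int = 200) -> pd.DataFrame:
    """Add neg/neu/pos/compound columns, then keep the most influential tweets."""
    scores = df["text"].apply(sia.polarity_scores).apply(pd.Series)
    scored = df.join(scores)                    # adds compound/neg/neu/pos columns
    scored["influence"] = scored["qrr"] + scored["fave"]
    return scored.nlargest(top_n, "influence")
```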

sample of ucs-2 (double-byte) symbols that impact Sentiment in Tweets
After Sentiment scoring, function outputs top x Tweets for each Sentiment type

The other nlp functions I built into this app included word tokenization, tf/idf (term frequency / inverse document frequency) analysis, filtering important words and phrases, building word clouds, and using my cleaned content to train vectorizing models.

Compound sentiment is scored on a range from -1.0 to +1.0, while the other three types are all on a 0 to +1.0 scale. I did the most plotting work with Compound and Negative scores. I’ve included histogram distributions of the four sentiment score types for both projects; the exhibit with the numerical table perhaps makes it easier to see the differences in sentiment between the two projects.

Same style as prior Sentiment Distribution, but for Gamestop Dataset

I do all my visualization work in Plotly.graph_objects: it allows me fine-grained control, it works well with Python and Pandas DataFrames, and it supports a wide range of renderers. The most intuitive approach was to use plot color for sentiment and plot size to scale influence, particularly for scatter plots. It also works well to plot compound sentiment on the y-axis, as its range is -1.0 to +1.0. For example, in one plot I color-code everything above the absolute value of the 90th percentile of scores in one color, the 50th to 90th in another, and so on. Other plots in the app use the mean and standard deviation of a particular sentiment score to create color bands between the mean + 1 s.d., mean - 1 s.d., mean - 2 s.d., and so on. In the plot I inserted above, there is a dual y-axis: stock price for plots of GME shares, with the size of the marker indicating daily volume traded, and sentiment of tweets during the same period, where tweet color is based on compound sentiment (from -1.0 to +1.0) and the size of the tweet marker is aggregate influence (Q+R+R+Fave counts).
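A stripped-down version of that encoding in plotly.graph_objects; the column names (‘sent_time’, ‘compound’, ‘influence’, ‘text’) are assumptions about the DataFrame, not the app’s actual schema:

```python
import pandas as pd
import plotly.graph_objects as go

def influence_scatter(df: pd.DataFrame) -> go.Figure:
    """x = tweet time, y = compound sentiment, marker size = influence,
    marker color = compound score; tweet text appears on hover."""
    return go.Figure(go.Scatter(
        x=df["sent_time"], y=df["compound"], mode="markers",
        marker=dict(
            size=df["influence"], sizemode="area",
            sizeref=2.0 * df["influence"].max() / 40 ** 2,   # cap marker area
            color=df["compound"], colorscale="RdYlGn",
            cmin=-1.0, cmax=1.0, showscale=True,
        ),
        text=df["text"],
        hoverinfo="text",
    ))
```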

ex. json for an RT (IDs blanked out) showing metrics for RT Plus originating tweet

Plotly has the ability to pass DataFrame columns, or Python lists, to variably control marker size or marker color on plots. That gives me 2 extra ‘dimensions’ on a 2-d plot, or 5 total ‘dimensions’ which I can express on a 3-d plot. The hues tend toward green for positive and red for negative, as expected. As Q-R-R is a measure of influence representing how many times the content has been shared, it is intuitive to scale the size of the marker on a plot by this measure, and color the marker for sentiment.

As a potential enhancement, I’m looking at adjusting the weight of the Q-R-R metric by the influence of the sender. For example, each user who retweets or quote-tweets could have that retweet scaled by their number of followers and such. Calculating an adjusted QRR with this weighting factor may be numerically expensive, though; I will do some tests. I’d also like to leverage annotations and patterns of hashtags and/or user mentions to understand the flow and importance of major threads within an overall topic, and identify a plot for that.
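As a back-of-the-envelope sketch of that weighting (purely hypothetical, not something the current app computes):

```python
import math

# Instead of counting each retweet as 1, weight it by the (log of the)
# retweeter's follower count. Design sketch only.
def weighted_qrr(retweeter_followers: list[int]) -> float:
    return sum(1.0 + math.log10(max(n, 1)) for n in retweeter_followers)
```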

Managing the Content ‘corpus’

‘Corpus’ is a common nlp term for a large chunk of text, in my case one made up of thousands of 140-character posts. I just want to mention that it’s helpful to map the state changes you make to core content, and to understand that some steps for other nlp functions may be really bad to do to the content before running something like Vader sentiment. This means sometimes making a copy of the dataset and sending the copies on different courses. I got hosed when I was learning this and had to back up, rebuild to an earlier step, and then split off.

Also, I didn’t find any way around running multiple, manual passes to identify and remove STOP words. I tried to set things up to run the same way with each of my datasets, but there were exceptions. I found that generating word-frequency dictionaries and word clouds along the way helped me get a feel for where I was with each dataset and what needed to be removed.
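The word-frequency pass is as simple as a Counter over the cleaned corpus; a sketch:

```python
from collections import Counter

# Quick frequency pass between cleaning steps: which high-frequency,
# low-information terms still need to join the STOP list?
def word_freq(corpus: list[str], top_n: int = 25) -> list[tuple[str, int]]:
    counts = Counter(word for text in corpus for word in text.lower().split())
    return counts.most_common(top_n)
```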

Ok, time for another sidebar from the app: I want to tell you the story of the European Super League…launch

The European Superleague (ESL) Story

Word Cloud from Tweets During the 3 Days of the Failed Launch of the European Superleague, April 2021

On Sunday April 18th, the release stated, “Twelve of Europe’s leading football clubs have today come together to announce…a new midweek competition, The Superleague…” It tried to ease conflict with FIFA and its most profitable confederation, UEFA (Union of European Football Associations), by stating, “Going forward, the Founding Clubs look forward to holding discussions with UEFA and FIFA…”.

Not all of the launch announcement was so conciliatory, as with “For a number of years, the Founding Clubs have had the objective of improving the quality and intensity of existing European competitions...and of creating a format for top clubs and players to compete on a regular basis”. As for motivations other than high-quality competition, the ESL stated, “Solidarity payments are expected to be in excess of €10 billion during the initial commitment period”. The ESL presser speaks of a 20-club league, but at the time of the announcement only 12 clubs had committed: 6 in England, 3 in Spain, and 3 in Italy.

In Champions League finals since 2010, only 3 clubs are not in ‘the gang of 12’

The pressures on European soccer are big: lost revenues from Covid-19, lack of binding resolutions to cool down an overheated player transfer market, a poor record with revenue sharing (broadcast, royalty, and gate), loopholes with financial austerity measures to control club spending, and an ever-expanding schedule which over-commits players. Rather than approach caps like the (effective) NFL salary cap for rostered players, UEFA has pressured clubs to balance their budgets and fined non-compliant owners, with mixed results. Covid-19 hit clubs in domestic 2nd and 3rd tier leagues really hard. Most players and staff took pay concessions to keep their jobs.

The ‘wild-west’ transfer market over the last decade has put pressure on top clubs to generate more revenue. That has led to increased match commitments and an almost non-existent off-season, particularly for clubs and players that go deep in year-end tournaments and have international caps in the summer. Succeeding across domestic fixtures, a couple of domestic tournaments, plus one of the UEFA tournaments (Champions League or Europa) each season requires a deeper bench than ever, and the ability to rotate some of the starting XI. Over-worked players without recovery time are at greater risk of injury, and escalating transfer fees for players mean injury downtime is more expensive.

Royalties, Rights plus Gate for UEFA’s Champions League is tops in world football

UEFA (the pre-eminent confederation within the global soccer federation, FIFA) oversees the biggest money-maker and fan draw outside of the World Cup: the annual Champions League tournament (UCL). For teams not able to qualify for the UCL, there is the 2nd-tier Europa League tournament. Clubs have a contractual obligation to compete in Europa if selected, but not only is the revenue from Europa much less, going ‘deep’ in the tourney puts demands on the club that can threaten a team’s finish in the domestic table, which is the primary way to qualify for the next year’s Champions League. For teams like Arsenal or Tottenham, who see themselves as peers to a Liverpool or Chelsea, the attraction of forming a tournament of elites through the ESL, with an assured revenue split, is understandable. With the ESL, they also wouldn’t have to deal with things like the Europa Cup and a selection process in England where there are 4 qualifying slots for the UCL each year but 6 or 7 ‘elite’ clubs vying for those slots. Full disclosure: I’m a Gunners fan (Arsenal!), but I have mixed feelings on the ESL and feel the best way to proceed is to reform the systems that are in place.

Spending caps at Barcelona have led to the exodus of players like Messi

The ‘arms-race’ in European football has hurt even clubs at the top of the pyramid. Barcelona, a team synonymous with La Liga and Champions League titles, now faces a season where La Liga (the Spanish top-flight league, which comes under UEFA, which comes under FIFA) has mandated Barcelona’s spending limit for 2021–22 for transfers AND salaries to be €97.92m! For comparison, Real Madrid’s for this season is €739.12m. A system which has driven clubs to unsustainable imbalances and led to Barcelona’s fall is the true raison d’être for the ESL launch.

Allegations of corruption, boondoggles, bribery, and other abuses of power have followed FIFA for years. Yet the ESL leaders found a way to align the average fan in the street with FIFA!! Yep, the fat cats at FIFA/UEFA and average fans aligned to rabidly protest the Superleague. There were interesting bedfellows in the Gamestop story, but nothing like this. But consider that the English Premier League itself was a breakaway from the English Football League in 1992! Rebellious breakaways are not anathema to football.

3-day timeline for ESL shows sharp clustering, or ‘tweetstorms’ around events

However in England, there has been a lot of negative reaction to foreign ownership of premier league teams, particularly American billionaires like the Kroenkes, Glazers or Fenway Sports Group. My analysis of hashtags, user mentions, and phrases of text from Tweets turned up a lot of negative sentiment directed at American ownership of Premier League clubs. Whatever the underlying currents going into the launch, the powers that be in the ESL fatally misjudged them and the strength of the negative reaction swelled in the first 24 hours.

Countdown to Launch…and Bail

All-in-one plot: Superleague launch to flush, influential tweets w/sentiment

On Sunday April 18th at about 11:30 BST, Italian journalist and transfer market expert Fabrizio Romano broke the story about the ESL launch. By that evening, social media was buzzing with protest about it (see charts). By Monday, fans were demonstrating at stadiums, and at the Sheffield-Liverpool match, Sheffield players came out for warmups in a kit which read, “Football is for the Fans”! German football powerhouse Bayern Munich (not a ‘group of 12’ member) released a statement against the ESL and in ‘solidarity with their fans’. Tuesday’s Chelsea-Brighton match at Stamford Bridge (west London) had to be delayed when players’ buses could not get through protesting crowds and into the stadium. Just before midnight on Tuesday April 20th, the ESL released a terse press release in which they said plans for the league had been ‘suspended’ for now.

Example of Former Player Sentiment re: Superleague Launch

The story was a perfect petri dish for looking at a topical Twitter storm. How often does a massive plan like a league with billions at stake just crater in three days?

BTW, in most of the plots I’ve included, I filtered on influence metrics to show only the 90th percentile and up, or 80th and up in a few (the tweet had to be in the top percentile band for either Q-R-R or Fave counts). Of course this is after they have passed through the filter for being on-topic, not a duplicate, and in the target date range. For on-topic, I used both white and black lists; black lists made it easier to filter out fish that got caught in the net, usually because of a shared or similar match on a hashtag search.
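A sketch of that on-topic filter; the whitelist and blacklist terms below are placeholders, not my actual lists:

```python
# Whitelist terms must appear; blacklist terms knock out look-alike matches
# caught by a shared hashtag. Term lists here are placeholders.
WHITELIST = ("superleague", "super league", "esl")
BLACKLIST = ("electronic sports league",)     # e.g. the e-sports "ESL"

def on_topic(text: str) -> bool:
    t = text.lower()
    return any(w in t for w in WHITELIST) and not any(b in t for b in BLACKLIST)
```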

3 Day Life of Superleague. Events plus Tweets. marker size shows influence, color shows sentiment

Give me a lever long enough and a fulcrum on which to place it, and I’ll move the world.

Did Archimedes have virtual ‘levers’ in mind when he said that? The Internet is undoubtedly a big lever, but do any of us have the knowledge and power today to set the fulcrum and bring about the correct force? Yes, it likely happened with GME, but there may have been some enabling factors which happened to be aligned. I can look at waves of influential tweets and crescendos of sentiment, but I’m honest that all I’ve been doing so far with social media is descriptive statistics. I hope to get to the predictive soon, but not in this PoC!

Tweet metadata is available on hover in all plots

For my last words on the Gamestop story: for many years since the inception of the discount brokerage, it seemed the retail investor was operating at an information deficit, which is fatal in an information economy. Social media seems to be a tool which can level many playing fields. No rational and free participant in any system will accept long-term, systemic imbalance. People with self-esteem act to level the field. Some who have gotten used to the favor of systemic imbalance will take things for granted and will get caught out. Water always seeks its level.

On this first foray into social media analytics, the plumbing took a lot of effort! I plan to drill down further with the Twitter data model and with Vader sentiment, as well as bring in other nlp functions which I’ve partially integrated. Good stuff for a next article :-).

I hope you enjoyed the story, I know it was a bit long.

Near home in Lakewood, CO

All the Best,

@bgherbert (created_at: 2008–02–29)

or @yobriangalindo (I’ve been using this Twitter account recently)

Brian G Herbert

I’ll be posting my Python 3.9 app to a repo on my GitHub account, and I’m also on LinkedIn under the same account name: briangalindoherbert.
