Intro
When I started looking into blockchain data by word count for posts, I wasn't sure what I really wanted to achieve. There is a difference between pure reporting and insight, and I like to think the feature engineering I have performed over the month gives us some insights by bringing to light data that wasn't present beforehand.
Features Built This Month
Through this month (and a little bit more) of work, I have made the following metrics available that were not present in the raw HIVE SQL database:
- Buckets for Word Counts
- Pay Per Vote
- Pay Per Word
- Pay Per Image
- Count of Images in Post
- Post Outcome (Above/Below Avg Words & Above/Below Avg Pay)
- Detecting Swearing in Posts
- And a few others
I have used a combination of Power Query M code and DAX to derive the above metrics.
This post is about sharing the month that was, with as many insights as I could possibly think of, without completely overwhelming HIVE SQL, or my own poor brain. There have been some challenges in the data set, and there are many accounts (such as curation aggregation posts, spam, bots, burn mechanisms, and others) that I have filtered out of the data to ensure I am getting as many human authors featured in this data as possible.
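The Power Query M and DAX behind these metrics are not shown in this post. Purely as an illustration, here is a minimal Python sketch of how the same kinds of features could be derived. The field names (`body`, `payout`, `votes`), the image-matching pattern, the exact bucket boundaries and the tiny swear-word list are all assumptions made for the example, not the actual implementation.

```python
import re
from statistics import mean

# Upper bounds mirror the bucket labels used in the table later in the post.
# Treating each bound as a strict "<" is an assumption.
BUCKET_BOUNDS = [50, 100, 250, 500, 750, 1000, 1500, 2000, 2500]

# Placeholder list for illustration; the real list used is not given.
SWEAR_WORDS = {"damn", "hell"}

# Assumed image markup: markdown ![alt](url) or an HTML <img ...> tag.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\([^)]*\)|<img\b[^>]*>", re.IGNORECASE)


def word_bucket(word_count):
    """Return a label such as '< 500 Words' for a post's word count."""
    for bound in BUCKET_BOUNDS:
        if word_count < bound:
            return f"< {bound} Words"
    return "> 2501 Words"  # top-bucket label copied from the table


def safe_ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def engineer_features(post):
    """Return a copy of one post dict with the derived metrics added."""
    body = post["body"]
    image_count = len(IMAGE_PATTERN.findall(body))
    # Strip image markup first so URLs and alt text are not counted as words.
    text_only = IMAGE_PATTERN.sub(" ", body)
    words = re.findall(r"[\w']+", text_only.lower())
    return {
        **post,
        "word_count": len(words),
        "word_bucket": word_bucket(len(words)),
        "image_count": image_count,
        "pay_per_vote": safe_ratio(post["payout"], post["votes"]),
        "pay_per_word": safe_ratio(post["payout"], len(words)),
        "pay_per_image": safe_ratio(post["payout"], image_count),
        "has_swearing": any(word in SWEAR_WORDS for word in words),
    }


def post_outcome(post, avg_words, avg_pay):
    """Classify a post into one of the four Above/Below Avg quadrants."""
    words = "Above" if post["word_count"] > avg_words else "Below"
    pay = "Above" if post["payout"] > avg_pay else "Below"
    return f"{words} Avg Words & {pay} Avg Pay"


def label_outcomes(posts):
    """Engineer features for every post, then attach its outcome quadrant."""
    featured = [engineer_features(p) for p in posts]
    avg_words = mean(p["word_count"] for p in featured)
    avg_pay = mean(p["payout"] for p in featured)
    return [{**p, "outcome": post_outcome(p, avg_words, avg_pay)} for p in featured]
```

Calling `label_outcomes` on a list of post dicts returns the same posts with the extra columns attached. Ratios come back as `None` when a post has no votes, words or images, which avoids divide-by-zero errors on empty posts.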
So, for the month of July, what have we achieved?
Not a bad effort. That's a lot of words and a lot of pictures. To put the words into perspective, if an average novel is 85,000 words, that's 445 novels published onto HIVE, or about 14 novels a day. :) Impressive!
But it took 7,124 authors to write that content, so on average each author wrote about 6% of a novel. If that trend holds and authors continue to submit content to HIVE, an average author will write an average novel on HIVE in about 16 months :) Not bad!
But not everyone writes fiction, or wants to write a novel.
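The novel arithmetic can be reproduced in a few lines. This is only a worked check using the three figures quoted in the text (85,000 words per novel, 445 novels, 7,124 authors); the total word count is implied from those figures rather than taken from the raw data.

```python
NOVEL_WORDS = 85_000     # assumed average novel length, as stated in the post
NOVELS_PUBLISHED = 445   # novel-equivalents published in July, from the post
AUTHORS = 7_124          # authors who posted in July, from the post
DAYS_IN_JULY = 31

# Implied total words: novels times words per novel (an approximation).
total_words = NOVELS_PUBLISHED * NOVEL_WORDS

# Novels per day across the whole platform.
novels_per_day = NOVELS_PUBLISHED / DAYS_IN_JULY

# Share of one novel written by an average author in the month.
novel_share_per_author = NOVELS_PUBLISHED / AUTHORS

# Months an average author needs to reach one full novel at that pace.
months_to_one_novel = 1 / novel_share_per_author

print(f"Implied total words: {total_words:,}")
print(f"Novels per day: {novels_per_day:.1f}")
print(f"Share of a novel per author: {novel_share_per_author:.1%}")
print(f"Months to one novel: {months_to_one_novel:.1f}")
```

Rounded, these come out at about 14 novels a day, about 6% of a novel per author, and about 16 months to a full novel, in line with the figures above.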
There is a HUGE number of images on HIVE, 795,709 - and we'll go into these numbers later.
Payment figures are also presented in the graphic above, and in the month, around $223k was paid out. The average payout was $2.59.
But as I said at the start of this post, I want to focus on the features that I engineered into the data, which make it possible to look at it by word count.
I will try to let the data speak for itself here:
Word Count and Image Insights
The below is a markdown table. It is a bit wide, so you might need to scroll sideways. It breaks down posts by their word count, then by a number of other metrics.
Word Count | Posts | % of Content | Total Pay | Max Pay | Avg Pay | Median Pay | Authors | Avg Words Per Picture | Images Found | Avg Images |
---|---|---|---|---|---|---|---|---|---|---|
< 50 Words | 14212 | 16.51% | $64,995 | $196 | $4.57 | $0 | 1771 | 5.46 | 38208 | 2.69 |
< 100 Words | 9726 | 11.30% | $8,523 | $118 | $0.88 | $0 | 1601 | 26.97 | 41300 | 4.25 |
< 250 Words | 17301 | 20.10% | $18,704 | $59 | $1.08 | $0 | 2698 | 54.75 | 104857 | 6.06 |
< 500 Words | 17998 | 20.91% | $36,064 | $82 | $2.00 | $1 | 3250 | 98.63 | 169239 | 9.4 |
< 750 Words | 12491 | 14.51% | $40,094 | $88 | $3.21 | $2 | 2587 | 142.58 | 156412 | 12.52 |
< 1000 Words | 5774 | 6.71% | $21,585 | $64 | $3.74 | $2 | 1741 | 156.00 | 96574 | 16.73 |
< 1500 Words | 5394 | 6.27% | $22,207 | $122 | $4.12 | $3 | 1339 | 165.63 | 115381 | 21.39 |
< 2000 Words | 1574 | 1.83% | $6,203 | $86 | $3.94 | $3 | 560 | 222.29 | 38038 | 24.17 |
< 2500 Words | 693 | 0.81% | $2,766 | $56 | $3.99 | $2 | 251 | 328.83 | 20750 | 29.94 |
> 2501 Words | 904 | 1.05% | $2,084 | $84 | $2.31 | $0 | 168 | 1550.87 | 14854 | 16.43 |
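As a sanity check, the headline figures can be rebuilt from the table rows. The sketch below copies the Posts, Total Pay and Images Found columns from the table and recomputes the totals and per-bucket averages. It works only from these rounded summary values, not from the underlying post-level data.

```python
# (bucket, posts, total_pay_usd, images_found) copied from the table above.
ROWS = [
    ("< 50 Words", 14212, 64995, 38208),
    ("< 100 Words", 9726, 8523, 41300),
    ("< 250 Words", 17301, 18704, 104857),
    ("< 500 Words", 17998, 36064, 169239),
    ("< 750 Words", 12491, 40094, 156412),
    ("< 1000 Words", 5774, 21585, 96574),
    ("< 1500 Words", 5394, 22207, 115381),
    ("< 2000 Words", 1574, 6203, 38038),
    ("< 2500 Words", 693, 2766, 20750),
    ("> 2501 Words", 904, 2084, 14854),
]

total_posts = sum(posts for _, posts, _, _ in ROWS)
total_pay = sum(pay for _, _, pay, _ in ROWS)
average_payout = total_pay / total_posts

# Per-bucket share of content, average pay and average images per post.
summary = {
    bucket: {
        "share_pct": round(100 * posts / total_posts, 2),
        "avg_pay": round(pay / posts, 2),
        "avg_images": round(images / posts, 2),
    }
    for bucket, posts, pay, images in ROWS
}

print(f"Posts: {total_posts:,}  Paid out: ${total_pay:,}  Average: ${average_payout:.2f}")
for bucket, stats in summary.items():
    print(f"{bucket:<14} {stats['share_pct']:>6}%  ${stats['avg_pay']:<5} {stats['avg_images']} images/post")
```

On the table's figures this gives 86,067 posts, $223,225 paid out and an average of about $2.59 per post, which matches the headline numbers quoted earlier.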
Top Authors by Various Metrics
Again, here, I am letting the data speak for itself. If you want to know a particular author's stats, let me know, and if I get time, you might get a reply - as I will not be able to upload the whole data set here, it is enormous and I don't want to crash your browser. :)
Most Posts
Most Replies
Most Pay
Highest Max Pay
Highest Average Pay
Most Words Published
Most Swearing
Highest Average Word Count
Most Images Posted
Highest Average Images
Highest Pay Per Word
What I Have Learned
When I started looking into this data, my hypothesis was that longer posts should get more rewards. What I have seen by looking at this data over the last six weeks or so is that this is not the case.
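A rough way to see this in the numbers is to check whether pay rises steadily from one word-count bucket to the next. The sketch below uses only the Avg Pay and Median Pay columns from the table earlier in the post. It is a coarse check on bucket summaries under that assumption, not a statistical test.

```python
BUCKETS = [
    "< 50 Words", "< 100 Words", "< 250 Words", "< 500 Words", "< 750 Words",
    "< 1000 Words", "< 1500 Words", "< 2000 Words", "< 2500 Words", "> 2501 Words",
]
# Values copied from the Avg Pay and Median Pay columns (USD).
AVG_PAY = [4.57, 0.88, 1.08, 2.00, 3.21, 3.74, 4.12, 3.94, 3.99, 2.31]
MEDIAN_PAY = [0, 0, 0, 1, 2, 2, 3, 3, 2, 0]


def is_non_decreasing(values):
    """True if every value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


def peak_buckets(values):
    """Return the bucket labels where the given metric is at its maximum."""
    top = max(values)
    return [bucket for bucket, value in zip(BUCKETS, values) if value == top]


print("Avg pay rises with length:", is_non_decreasing(AVG_PAY))
print("Median pay rises with length:", is_non_decreasing(MEDIAN_PAY))
print("Highest avg pay:", peak_buckets(AVG_PAY))
print("Highest median pay:", peak_buckets(MEDIAN_PAY))
```

Neither column rises steadily with length. The highest average pay sits in the shortest bucket (< 50 Words), the median peaks in the < 1500 and < 2000 Words buckets, and both fall away for the longest posts.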
With creative content such as blogs, travel logs, photography, art, music, fiction, philosophy, science, homesteading, code snippets, or the vast range of other content that people post on the platform, there is no single indicator of quality that can be programmatically determined.
We would need to compare like with like, within communities, and remove so many different variables.
There are people who post elaborate signatures, full of referral links, or full of images. There are others who use few words, but use images powerfully. There are others who post their video content, their gaming streams.
Some plump up their word count by providing content in a bilingual format - when we all have browsers that can translate with a single click. Some repeat pictures in these posts, or use layouts with columns.
There is a vast variety of content. We can't compare content by any single metric. We have to look at it with our own eyes, and vote for it with our own standards and sensibilities. That is what makes this platform so good - we can search for what we like, and we can appreciate what we like.
Comments on Purpose
Everyone has a different purpose for HIVE. For me, I think it will be to step away from the data for a little bit. I wanted to sharpen my skills using Power BI and Power Query, as this is what I did in a previous professional job, and I don't want to forget how to use the software.
I will be using my time to focus on writing my short story anthology, rambling posts, and a bunch of other things as my interests flit from thing to thing. And well, I want to play some games, too, because I miss gaming. I want to read books, and not continue to go down so many rabbit holes in this data set, now that I have learned some lessons along the way.
I hope that other people found the data useful - but coming back to purpose, please, tell me - What is your purpose for HIVE - are you writing a novel? Are you posting photographs? Are you posting your Art?
What are you passionate about? Because that is one thing that the data cannot tell me.