This data visualization is a tribute to Anne Sexton, the Pulitzer Prize-winning American poet.
This project was created for a data visualization studio course at the University of Miami. The design brief: Create a data visualization story or essay on a topic of your choice to present online and learn something new (e.g. R, d3.js, CSS, HTML, Illustrator).
Given that brief, I decided to analyze poetry as a way to learn text analysis. Transformations by Anne Sexton is a collection that has been significant in my life, so I chose her poetry as the material for learning the concepts and applications of text analysis, as well as the tools used for it (Python or R).
Plain text files were merged into an Excel file, then cross-checked against Anne Sexton: The Complete Poems.
My first step was to create a corpus of her poems. This required understanding how many poems Anne Sexton wrote, how many were published, how many were available, and in what formats. Many were available online, but many others were not, so I had to hunt down and wrangle a variety of formats. I started with web scraping (I used ParseHub), then copied and pasted manually from PDFs and ebooks. All of her poems went into plain text files, which I then cleaned. Finally, a script exported the text files to a single Excel document for analysis in R.
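The merge step can be sketched roughly as follows. This is a minimal Python illustration, not the script used for the project: it assumes one poem per `.txt` file with the file name serving as the title, and it writes a CSV (which Excel and R can both open) rather than a native Excel workbook. The function name `build_corpus` and the `title` and `text` column names are hypothetical.

```python
import csv
from pathlib import Path


def build_corpus(text_dir, out_csv):
    """Merge every .txt file in text_dir into one CSV, one row per poem.

    Assumption: each file holds a single poem and the file name
    (without extension) is the poem's title.
    """
    rows = []
    for path in sorted(Path(text_dir).glob("*.txt")):
        raw = path.read_text(encoding="utf-8")
        # Light cleaning: drop trailing whitespace on each line and
        # blank lines at the very start and end of the poem.
        lines = [line.rstrip() for line in raw.splitlines()]
        cleaned = "\n".join(lines).strip()
        rows.append({"title": path.stem, "text": cleaned})

    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "text"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Keeping one row per poem makes it straightforward to cross-check the table against a printed table of contents before any analysis begins.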
A few screengrabs of the many charts (and many mistakes) created in R to learn text analysis.
R offered the right combination of features for analyzing the poems. Python had been my first choice given its wide application, but the learning curve was too steep for me given the scheduled deadlines. The book Text Mining with R became my Bible. It provided enough background to understand text analysis, to learn R, and then to apply both to my project.
Explorations and experiments in R.
Exploring and experimenting in R was a necessary step for me to practice and discover. Some of the questions I had about Anne Sexton’s work included:
Screengrabs of drafts from learning CSS Grid and parallax. I also tried scrollytelling.
Using Excel to identify the most positive and negative poems per collection and through all collections.
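For readers who prefer code to spreadsheets, the same ranking can be sketched in Python. This is only an illustration under stated assumptions: the project did this step in Excel, the tiny `LEXICON` below is a made-up stand-in for a real sentiment lexicon, and the `collection`, `title`, and `text` field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon: +1 for positive words, -1 for negative ones.
# A real analysis would load a published sentiment lexicon instead.
LEXICON = {"love": 1, "joy": 1, "bright": 1,
           "death": -1, "pain": -1, "dark": -1}


def score(text):
    """Sum the lexicon values of every word in the text."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return sum(LEXICON.get(w, 0) for w in words)


def extremes_by_collection(poems):
    """Return {collection: (most_negative_title, most_positive_title)}.

    poems is a list of dicts with 'collection', 'title' and 'text' keys.
    Ties are broken alphabetically by title.
    """
    grouped = defaultdict(list)
    for poem in poems:
        grouped[poem["collection"]].append((score(poem["text"]), poem["title"]))
    return {name: (min(scored)[1], max(scored)[1])
            for name, scored in grouped.items()}
```

Passing every poem under a single collection name gives the most positive and most negative poems across the whole body of work.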
My last chart needed to be interactive, and that meant learning D3.js. This was the last hurdle of my project, but the final deadline was looming, so I first looked into backup options such as Flourish, Datawrapper, Charticulator, RAWGraphs, and Tableau. Tableau was the winner, but I wasn’t satisfied, so I spent two days coding my first D3 chart. Much of it is hard-coded, which isn’t ideal, but it works, and coding it was a major accomplishment. I’ll be learning more D3 in my final semester.
Take good, detailed notes. This is one piece of advice I took to heart from more experienced data viz designers and data journalists. I referred back to my notes often whenever I had to redo analyses and charts. They saved me a great deal of time and potential hours of frustration.
Just dive in. I spent a lot of time afraid to write lines of code because I thought I would break something. Once I got over that hurdle, I started to enjoy learning by trial and error. One important lesson: console.log as you write.
Ask for help. When you get stuck, really stuck, it is important to ask for help, but make it a learning opportunity. Ask about the mistakes and errors, and about why a different approach is better. Then do it yourself. Getting immediate feedback is a great way to learn.
Interactive charts are not required. Just because you can, doesn’t mean you must. Most of the charts in Poetry Between Pain are not interactive and they don’t need to be. I was reminded not to get lost in novelty or trends.