This month saw big news from the fast-food industry: mass walk-outs and protests over low wages and poor working conditions. We reported on the news itself, but we also wanted to come up with a way readers could more immediately relate the news to their own lives. It’s one thing to read about McDonald’s workers who make minimum wage and live in poverty—it’s another thing to understand exactly how your own behavior and consumption patterns could contribute to the problem.
There has already been some research showing that only a slight increase in menu prices at fast-food restaurants can translate to meaningful wage hikes for workers (assuming, of course, that companies put that increased sales revenue toward salaries—a big assumption). But since these companies depend on low prices for their business model, there’s also a big question as to whether people will pay more for their burgers and fries.
We came up with something we called The McPoverty Calculator, the goal of which is to let you, the reader, decide how much more you’d pay for a Big Mac (or similar menu item) and see how those extra cents add up for workers.
The calculator itself is a small form that allows users to select, through a drop-down menu, 0 cents to 22 cents—the extra amount they’d pay over the $4.56 average price for a Big Mac. Pick a figure and you automatically see how much above the $7.25 minimum wage workers could make as a result, and whether that gets them over the official poverty line.
First let me say that much of the reporting for this piece was done by Filipa Ioannou, our incredible summer intern. She contacted Robert Pollin and Jeannette Wicks-Lim, who have researched this question and who helped us work out the math so it would be bullet-proof. My role was writing the code for the calculator, a delightful challenge that tested a wide variety of skills, ranging from algebra to economics to jQuery.
In order to explain how I produced the calculator, I’ll break the process down into three steps:
  1. Coding the form and the jQuery functionality to reveal the answer.
  2. Getting the mathematical formula right.
  3. Deciding what the form should look like and how the answer should be displayed.
Coding the Form
The drop-down menu, for which I used the HTML “select” tag, has an id of “target”. I then used the jQuery function change() so that anytime a user changed his or her selection, a function would run. This unnamed function first assigns the user’s cents-above value, $("#target").val(), to a variable called big_mac.
This function then performs the math necessary to generate the worker’s new wage and annual income. We’ll go over the formula in the next section, but basically after the function runs we have the variables “wage” (the new hourly wage) and “m_sal” (the resulting annual salary).
To display the answer, I put an empty div with an id of “answer” just below the form. Then, using a series of if/else statements, the function determines which answer sentence to send to the “answer” div. That code looks like:
if (m_sal < pov) {
  $('#answer').html("Really? That's the best you can do?...");
}

Here “pov” is set to 18480, the poverty line for a family of three. Since we use change() to call this function, it runs every time the user selects a new price increase. Cool?
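Put together, the pieces look roughly like this. This is a sketch from memory, not the production code: the helper name and the second answer sentence are my stand-ins.

```javascript
// Sketch of the change() handler and if/else display logic.
// The helper name and the above-poverty sentence are my inventions.
var pov = 18480; // poverty line for a family of three

// Pure helper, so the sentence choice is easy to test.
function answerFor(m_sal) {
  if (m_sal < pov) {
    return "Really? That's the best you can do?...";
  } else {
    return "Nice: that gets the worker over the poverty line.";
  }
}

// Page wiring (runs in the browser; shown here for context only):
// $('#target').change(function () {
//   var big_mac = parseInt($('#target').val(), 10); // cents above $4.56
//   // ...the math from the next section produces wage and m_sal...
//   $('#answer').html(answerFor(m_sal));
// });
```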
Getting the Math Right
As we explained in the “Notes” paragraph beneath the calculator, the numbers generated are estimates based on research conducted by Jeannette Wicks-Lim and Robert Pollin. We spoke with Jeannette and Robert, both extremely helpful, by phone while putting this all together.
Basically, we decided our data should follow the line of best fit on page 7 of their paper, namely y = 0.0454x^(0.6363), where y is the price increase of the Big Mac as a percentage and x is the wage increase as a percentage. One major caveat the economists gave us was that their formula only worked for wage increases (as opposed to decreases), and only for increases up to about 95%. This translates into a maximum Big Mac price increase of 22 cents above the average of $4.56. For this reason the drop-down menu only goes up to 22 cents.

Getting that formula into JavaScript took a bit of algebra, but a few if statements and a Math.pow() later I had my new wage and annual income.
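In sketch form, the arithmetic looks something like this. It’s my reconstruction of the steps, not the production code, and the 40-hours-a-week, 52-weeks-a-year assumption is mine; the published calculator’s constants may differ.

```javascript
// Reconstruction of the calculator's math (illustrative only).
var BASE_PRICE = 4.56;   // average Big Mac price, in dollars
var MIN_WAGE   = 7.25;   // federal minimum wage, dollars per hour
var POV        = 18480;  // poverty line for a family of three, per year

// The best-fit line y = 0.0454 * x^0.6363 relates the fractional price
// increase (y) to the fractional wage increase (x). Invert it to get x.
function wageIncreaseFromPrice(priceFrac) {
  return Math.pow(priceFrac / 0.0454, 1 / 0.6363);
}

function resultsForCents(cents) {
  var priceFrac = (cents / 100) / BASE_PRICE;        // e.g. 22 cents -> ~4.8%
  var wageFrac  = wageIncreaseFromPrice(priceFrac);  // fractional wage bump
  var wage = MIN_WAGE * (1 + wageFrac);
  var sal  = wage * 40 * 52; // assumes full-time, year-round work
  return { wage: wage, sal: sal, abovePoverty: sal >= POV };
}
```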
Form and Display
Now that we had our wage, annual income, and knowledge of whether this puts the worker below or above the poverty line, we just needed to decide how it would all be displayed on the page.

My first draft let users type their new price into a text box (as opposed to a drop-down menu). I used the jQuery method keyup() so results would be generated immediately. However, once Jeannette and Robert explained the upper bounds of their predictive formula, we decided to use something that would give users a hard upper limit.
At the suggestion of ex-Newsbeast Labber Michael Keller, I looked into using Tangle.js, a cool little JavaScript library that lets users click and drag their mouse over a hyperlink to affect values elsewhere on the page. While this served our “hard limit” requirement, the default example didn’t seem intuitive enough for our purposes, and I unfortunately didn’t have time to explore other layouts.
Thus we settled on a drop-down menu for the input form. Now we had to look at ways to display the “answer”.
One draft that I called the “gas station sign” approach displayed the three important numbers in dotted divs.

However, we ultimately returned to the answer-as-paragraph approach.
We did add an embedded YouTube clip of the McDonald’s theme song, “I’m Lovin’ It,” for whenever users selected a price increase that got the minimum-wage worker out of poverty. My editor wanted the older “two beef patties… special sauce” theme, but Filipa and I convinced her that that was a bit dated. Our editor also suggested we embed a “sad trombone” clip for when users select a price that keeps the worker below the poverty line, but I disagreed and she picked her battles.
Odds and Ends
One cool little thing I grabbed from a Stack Overflow page was the formatMoney function. Rather than the default toFixed() call, this function puts commas in the right places for amounts over 999, which I needed for the annual income. Worked great.
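Here’s a minimal equivalent from memory — a paraphrase of the idea, not the exact Stack Overflow snippet we copied:

```javascript
// Minimal formatMoney in the spirit of the Stack Overflow helper:
// round to the given number of decimal places, then insert commas
// every three digits left of the decimal point.
Number.prototype.formatMoney = function (decPlaces) {
  var n = this.toFixed(decPlaces);
  return n.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
};
```

With this in place, m_sal.formatMoney(0) renders an annual income like 15080 as "15,080".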
Another hurdle for me was the Tweet button. Once users select a Big Mac price increase, in addition to the resulting wage and income being displayed on the page, a Twitter button appears. When users click the button, the default text of the tweet contains their selected price increase and the resulting worker salary. Here’s how I did it (not pretty, but it works).
The button is an image, wrapped in an “a” tag with an id="twitter-share-button" and an href="#". Then I wrote up this click function:
$('#twitter-share-button').on('click', function() {
  var url = "https://twitter.com/intent/tweet?button_hashtag=McPoverty" +
    "&url=http%3A%2F%2Fthebea.st/14FkJaM" +
    "&text=I%20would%20be%20willing%20to%20pay%20" + cents_more +
    "%20cents%20more%20for%20a%20Big%20Mac%20so%20a%20McDonald%27s%20worker" +
    "%20can%20earn%20$" + m_sal.formatMoney(0) + "%20a%20year";
  window.open(url, 'newwindow', 'width=600, height=450');
});
That’s about it!
— Sam Schlinkert

Today we published a data story looking at how iOS devices fail to accurately correct some words such as “abortion” and “rape.” Here’s a detailed methodology on how we did that analysis.
It started back in January when we were working on our project mapping access to abortion clinics. The reporters on the project, Allison Yarrow and I (Michael Keller), were emailing a lot about the project, which led to us typing the word “abortion” into our phones on a fairly regular basis. We noticed that iOS never autocorrected this word when we misspelled it, and when we would double-tap the word to get spelling suggestions, the correctly spelled word was never an option. We decided to look further into whether this could be repeated on iPhones with factory settings and what other words iOS doesn’t accurately correct. To do this, we decided to find a complete list of words that iOS software doesn’t accurately correct.
We did this in two stages:

Stage One: Use the iOS API’s built-in spell-checker to test a list of misspelled words programmatically.


Step 1: Get a list of all the words in the English language
We combined two dictionaries for this: The built-in Mac OS X dictionary that can be found in /usr/share/dict on a Mac and the Wordnet corpus, a widely-used corpus of linguistic information, which we accessed through NLTK, a natural language processing library for Python. We left out words shorter than three characters, words in the corpus that were two words (e.g. “adrenal gland”), and words with punctuation such as dashes or periods (e.g. “after-shave”, “a.d.”). We reasoned that these words were either too short to accurately correct or had more variables to them than we would be able to test on an even playing field, so we left them out of our analysis.
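The filtering rules above can be sketched as a single predicate. This is illustrative only; the actual filtering happened while building the list with Python/NLTK.

```javascript
// Sketch of the word-list filter described above (illustrative).
function keepWord(w) {
  if (w.length < 3) return false;            // too short to correct fairly
  if (w.indexOf(" ") !== -1) return false;   // two-word entries, e.g. "adrenal gland"
  if (!/^[A-Za-z]+$/.test(w)) return false;  // dashes/periods, e.g. "after-shave", "a.d."
  return true;
}
```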

Step 2: Create misspellings of these words
We wanted to test slightly misspelled versions of every word in the English language so, to start, we wrote a script that produced three misspellings of each one: one where the last character was replaced with the character to the left of it on the keyboard, one where the last character was replaced with the character to the right of it on the keyboard, and a third where the last character was replaced with a “q”. Because modern spellcheck systems know about keyboard layout, these adjacent-character misspellings should be the low-hanging fruit of corrections.
For instance, “gopher” would become “gophet,” “gophee,” and “gopheq”.
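Here’s the adjacent-key logic in sketch form — an illustrative reconstruction in JavaScript; the original script’s code and its edge-case handling may differ.

```javascript
// Generate the three test misspellings of a word: last character swapped
// for its left and right keyboard neighbors, plus a trailing "q".
var ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"];

function neighborKey(ch, offset) {
  for (var i = 0; i < ROWS.length; i++) {
    var idx = ROWS[i].indexOf(ch);
    if (idx !== -1) {
      var j = idx + offset;
      return j >= 0 && j < ROWS[i].length ? ROWS[i][j] : null;
    }
  }
  return null; // not a letter key we track
}

function misspellings(word) {
  var stem = word.slice(0, -1);
  var last = word.slice(-1);
  var out = [];
  var left = neighborKey(last, -1);
  var right = neighborKey(last, 1);
  if (left) out.push(stem + left);
  if (right) out.push(stem + right);
  out.push(stem + "q");
  return out;
}
```

misspellings("gopher") yields "gophee", "gophet", and "gopheq"; when the last letter sits at the edge of a keyboard row, the missing neighbor is simply skipped.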

Step 3: Run these misspelled words through an iOS API spellchecker program.
Apple doesn’t have a “spellcheck program,” but for iOS developers it has an API with a function that will take in a misspelled word and return a list of suggested words in the order of how likely it thinks each suggestion is. In Xcode, the program you use to write iPhone and iPad apps, you can use a function under the UITextChecker class called “guessesForWordRange,” which will do just that. Before testing each word, however, we ran the non-misspelled word through a function in this class called “rangeOfMisspelledWordInString,” which will tell you whether the word in question exists in the iOS dictionary. This meant that we weeded out words that were in our Wordnet and Mac dictionary lists but that iOS wasn’t aware of. In other words, we only tested words that, if you spelled them correctly on an iOS device, wouldn’t get the red underline. For all of our tests we used the then-most-up-to-date version of Xcode, 4.6.2, and ran the most up-to-date version of the iOS 6 Simulator.
We also checked that our misspelled word wasn’t itself a real word in the dictionary. For example, “tab” has a right-adjacency misspelling of “tan,” which is also a word. In that case, the script fell back to the “q” misspelling. So if it was testing “tan” as a misspelling of “tab,” it would see that “tan” is a real word and throw “taq” at the spellchecker instead. Obviously, “taq” is a harder misspelling of “tab” to correct, but we also gave it “tav,” its left-adjacency misspelling. If the spellchecker got either of these right, we counted “tab” as a word it can accurately correct. Later on we did many more misspelling combinations as our list got smaller, to be sure we gave the spellchecker many chances to correct what should be easy corrections.
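The real-word guard can be sketched the same way (again illustrative; the function name and shape are mine):

```javascript
// When an adjacent-key misspelling is itself a real word (e.g. "tan"
// for "tab"), swap in the harder trailing-"q" misspelling instead.
function guardedMisspellings(candidates, stem, dictionary) {
  var out = candidates.map(function (c) {
    return dictionary.has(c) ? stem + "q" : c;
  });
  // dedupe, in case more than one candidate collapsed to the "q" form
  return out.filter(function (c, i) { return out.indexOf(c) === i; });
}
```

For “tab,” guardedMisspellings(["tav", "tan"], "ta", dictionary) keeps "tav" and replaces "tan" with "taq".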

Step 4: Analyze results
If a word was accurately corrected at least once, we marked it as properly recognized by iOS. This process narrowed our list down from about 250,000 to roughly 20,000 words. There was one big problem though: the iOS spellcheck didn’t accurately correct some words that real iPhones were able to correct. For instance, the API wouldn’t correct “aruguls” to “arugula,” for some reason. Our questions to Apple on this went unanswered; if anyone has any suggestion as to why the two systems are different, please let us know.
After meeting with some New York-area iOS developer meetup groups, we found that the spellcheck on the iOS simulator as a part of Xcode does correct these edge cases, which led us to stage two.

Stage Two: Use spellcheck on the iOS simulator to check the remaining 20,000 words
To access the word suggestions on the iOS simulator, you need one crucial piece of hardware: a human hand. We were able, easily enough, to write an iOS program that presents a word on the simulator, but there’s no way to programmatically pull up the spellcheck suggestion menu, because iOS programs don’t have scope for system-level operations. To do that, you need to physically double-click the word and navigate through the various menus.

Step 1: Find a way to automate clicking
To solve this, we got into our wayback machine and wrote an AppleScript that would move the mouse to specific coordinates on the screen, wait a specified number of milliseconds for menus to appear, and then click in the appropriate places. Our iOS program had a button that, when clicked, saved the original word, the presented misspelled word, and the final result of the correction. Our AppleScript clicked through the menus, replaced the word if the simulator presented a suggestion, then clicked the button to serve the next word.
We tried to make this process as fast as possible, but it ended up taking around 1.6 seconds per word. 1.6 multiplied by 20,000 is 32,000 seconds, or nearly nine hours. But we also wanted to present even more misspelling options—twelve more in total.
We can call this Step 2: create more misspellings:
1. Double last character.
2. Double last character with a capitalized first character.
3. Missing last character.
4. Missing last character with a capitalized first character.
5. Misspelled first character (via left misspelling adjacency) and capitalized first character.
6. Misspelled first character (via left misspelling adjacency).
7. Misspelled first character (via right misspelling adjacency) and capitalized first character.
8. Misspelled first character (via right misspelling adjacency).
9. Misspelled second character (via left misspelling adjacency) and capitalized first character.
10. Misspelled second character (via left misspelling adjacency).
11. Misspelled second character (via right misspelling adjacency) and capitalized first character.
12. Misspelled second character (via right misspelling adjacency).

So, including our first misspelled-last-character lists with left/right adjacencies, we had 14 lists of 20,000 words to run through. 14 multiplied by 8.8 hours = 123.2 hours, which is more than five days if the program ran straight, 24 hours a day. We needed to take a break between each of the 14 sessions, however, and restart Xcode just in case there was a learning algorithm; we didn’t want the results of one session to pollute another.
Renting computers from Amazon is easy, but not if they’re Mac OS computers, which aren’t available through Amazon and get rather expensive through other dealers. Fortunately, the Columbia School of Journalism let us take over one of their Mac computer labs, so we were able to run the script in parallel and finish in a much more reasonable time frame. It also meant my laptop wasn’t out of commission crunching words for a week. Here’s a Vine of what the automated corrections looked like:

One drawback of this method was that we could only get the mouse simulator to select the first suggestion. So, if for the misspelled word “abortiom” the simulator suggested “aborted” as more likely than “abortion,” this program would mark that as an inaccurate correction. We weren’t too worried about this, though, because 1) our iOS script in stage one *did* take into account multiple suggestions, so all the words had two chances to be corrected in that scenario, and 2) we presented 14 different misspellings of these words, and if any one of these variations was corrected we counted the word as accurately corrected. If a word that is only off by one character isn’t suggested that many times, then something in the algorithm isn’t handling that word correctly.

Step 3: Analyze results
This second stage only cut out around 6,000 words, leaving us with 14,000 words that were never accurately corrected. The related article lays out our findings, but our initial hypothesis, that “abortion” is a word iOS doesn’t correct, unlike Android phones, held true. Apple declined to comment for this project, so we have many unanswered questions. One idea for future research is whether iOS devices are incapable of learning certain words like “abortion.” That is to say, these words are blocked not just at the dictionary-suggestion level, but at the machine-learning level as well.

Stage Zero: Find the files.
Before we did stage one we had a different strategy: find this list of seemingly banned words somewhere in the iOS file structure. To do this, we put out a call on Facebook for any friends who would donate an old iPhone to be jailbroken. We got three phones: one from my mom, and two from some very nice old friends who mailed them to our offices. We factory-reset and jailbroke one and kept the others factory-fresh for testing. We went searching and found some promising files in the LinguisticData directory called “pos,” “ner,” and “lemmas,” which, in the natural language processing world, stand for “part of speech,” “named entity recognition,” and “lemmatization,” which is analyzing word stems and inflected forms, like “better” being associated with “good” as its base. These files were unreadable, however, because they weren’t in any known format. The only way we could read them was in their raw binary-hex format, which looks like that terrible mess of characters you see when you open a corrupted Word document: like Wingdings but with less rhyme or reason.
After many attempts at deciphering where a list of blocked words could reside, and after reaching out to the New York iOS community, we started in earnest on reverse-engineering the list ourselves with stage one.

Today we published a data story looking at how iOS devices fail to accurately correct some words such as “abortion” and “rape.” Here’s a detailed methodology on how we did that analysis.

It started back in January when we were working on our project mapping access to abortion clinics. The reporters on the project, Allison Yarrow and myself (Michael Keller) were emailing a lot about the project, which led to us typing the word “abortion” into our phones on a fairly regular basis. We noticed that iOS never autocorrected this word when we misspelled it, and when we would double-tap the word to get spelling suggestions, the correctly spelled word was never an option. We decided to look further into whether this could be repeated on iPhones with factory settings and what other words iOS doesn’t accurately correct. To do this, we decided to find a complete list of words that iOS software doesn’t accurately correct.

We did this in two stages:

Stage One: Use the iOS API’s built in spell-checker to test a list of misspelled words programmatically.

Step 1: Get a list of all the words in the English language

We combined two dictionaries for this: The built-in Mac OS X dictionary that can be found in /usr/share/dict on a Mac and the Wordnet corpus, a widely-used corpus of linguistic information, which we accessed through NLTK, a natural language processing library for Python. We left out words shorter than three characters, words in the corpus that were two words (e.g. “adrenal gland”), and words with punctuation such as dashes or periods (e.g. “after-shave”, “a.d.”). We reasoned that these words were either too short to accurately correct or had more variables to them than we would be able to test on an even playing field, so we left them out of our analysis.

Step 2: Create misspellings of these words

We wanted to test slightly misspelled versions of every word in the English language so, to start, we wrote a script that produced three misspellings of each one: one where last character was replaced with the character to the left of it on the keyboard, one where the last character was replaced with the character to the right of it on the keyboard, and a third one where the last character was replaced with a “q”. Because modern spellcheck systems know about keyboard layout, these adjacent-character misspellings should be the low-hanging fruit of corrections.

For instance, “gopher” would become “gophet,” “gophee,” and “gopheq”.

Step 3: Run these misspelled words through an iOS API spellchecker program.

Apple doesn’t have a “spellcheck program” but for iOS developers, it has an API with a function that will take in a misspelled word and return a list of suggested words in the order of how likely it thinks each suggestion is. In Xcode, the program you use to write iPhone and iPad Apps, you can use a function under the UITextChecker class, called “guessesForWordRange” which will do just that. Before testing each word, however, we ran the non-misspelled word through a function in this class called “rangeOfMisspelledWordInString” which will tell you whether the word in question exists in the iOS dictionary. This meant that we weeded out words that were in our Wordnet and Mac dictionary lists but that iOS wasn’t aware of. In other words, we only tested words that if you spelled them correctly on an iOS device, they wouldn’t get the red underline. For all of our tests we used the then-most up-to-date version of Xcode, 4.6.2, and ran the most up-to-date version of the iOS 6 Simulator.

We also tested whether the misspelled word was in the dictionary and to make sure our misspelled word wasn’t also a real word. For example, “tab” has a right-adjacency misspelling of “tan” which is also a word. In that case, the script fell back to the “q”-misspelling. So if it was testing “tan” as a mispelling for “tab” it would see that “tab” is a real world and throw “taq” at it as the misspelling. Obviously, “taq” is a harder misspelling of “tab” to correct, but we also gave it “tav”, its left adjacency misspelling. If it got either of these right we would count “tab” as a word that it can accurately correct. Later on we did many more misspelling combinations as our list got smaller to be sure we gave the spellchecker many chances to correct what should be easy corrections.

Step 4: Analyze results

If a word was accurately corrected at least once, we marked it as properly recognized by iOS. This process narrowed our list down from about 250,000 to roughly 20,000 words. There was one big problem though: the iOS spellcheck didn’t accurately correct some words that real iPhones were able to correct. For instance, the API wouldn’t correct “aruguls” to “arugula,” for some reason. Our questions to Apple on this went unanswered; if anyone has any suggestion as to why the two systems are different, please let us know.

After meeting with some New York-area iOS developer meetup groups, we found that the spellcheck on the iOS simulator as a part of Xcode does correct these edge cases, which led us to stage two.

Stage Two: Use spellcheck on the iOS simulator to check the remaining 20,000 words

To access the word suggestions on the iOS simulator, you need one crucial piece of hardware: a human hand. We were able to write an iOS program easily enough that presents a word on the simulator, but there’s no way to programmatically pull up the spellcheck suggestion menu because iOS programs don’t have scope for system level operations. To do that, you need to physically double-click the word and navigate through the various menus. 

Step 1: Find a way to automate clicking

To solve this, we got into our wayback machine and wrote an AppleScript that would move the mouse to specific coordinates on the screen, wait a specified number of milliseconds for menus to appear and then click in the appropriate places. Our iOS program had a button that, when clicked, saved the original word, the presented misspelled word, and the final result of the correction. Our AppleScript script clicked through the menus, replaced the word if the simulator presented a suggestion, then clicked the button to serve the next word. 

We tried to make this process as fast as possible, but it ended up taking around 1.6 seconds per word. 1.6 multiplied by 20,000 is 32,000 seconds, or roughly 8.9 hours. But we also wanted to present even more misspelling options: twelve more in total.

We can call this Step 2: creating more misspellings.

1. Double last character.

2. Double last character with a capitalized first character.

3. Missing last character.

4. Missing last character with a capitalized first character.

5. Misspelled first character (via left misspelling adjacency) and capitalized first character.

6. Misspelled first character (via left misspelling adjacency).

7. Misspelled first character (via right misspelling adjacency) and capitalized first character.

8. Misspelled first character (via right misspelling adjacency).

9. Misspelled second character (via left misspelling adjacency) and capitalized first character.

10. Misspelled second character (via left misspelling adjacency).

11. Misspelled second character (via right misspelling adjacency) and capitalized first character.

12. Misspelled second character (via right misspelling adjacency).
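Sketched in JavaScript, the twelve generators above look roughly like this (the adjacency map is a toy stand-in for the full keyboard layout we used):

```javascript
// Toy adjacency map covering only the characters in the example word.
var ADJ = {
  a: { left: 'q', right: 's' },
  b: { left: 'v', right: 'n' },
  t: { left: 'r', right: 'y' }
};

function cap(w) { return w.charAt(0).toUpperCase() + w.slice(1); }

// Swap the character at index i for its left or right keyboard neighbor.
function swapAt(word, i, side) {
  var n = ADJ[word.charAt(i)];
  return n ? word.slice(0, i) + n[side] + word.slice(i + 1) : null;
}

function makeVariants(word) {
  var doubled = word + word.charAt(word.length - 1); // doubled last char
  var chopped = word.slice(0, -1);                   // missing last char
  var base = [
    doubled, chopped,
    swapAt(word, 0, 'left'), swapAt(word, 0, 'right'), // first character
    swapAt(word, 1, 'left'), swapAt(word, 1, 'right')  // second character
  ].filter(Boolean);
  // Each variant also gets a capitalized-first-character twin.
  return base.concat(base.map(cap));
}

// makeVariants('tab') -> 12 variants: 'tabb', 'ta', 'rab', 'yab',
// 'tqb', 'tsb', plus the same six capitalized.
```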

So, including our first misspelled-last-character list with left/right adjacencies, we had 14 lists of 20,000 words to run through. Fourteen runs at roughly 8.9 hours each comes to about 124 hours: more than five days, even if the program ran 24 hours a day. We also needed to take a break between each of the 14 sessions and restart Xcode, in case there was a learning algorithm at work; we didn’t want the results of one session to pollute another.
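The schedule math, spelled out:

```javascript
// Back-of-the-envelope runtime estimate, using the figures from the post.
var secondsPerWord = 1.6;
var words = 20000;
var lists = 14;

var hoursPerList = secondsPerWord * words / 3600; // ~8.9 hours per list
var totalHours = hoursPerList * lists;            // ~124 hours overall
var totalDays = totalHours / 24;                  // ~5.2 days of nonstop runs
```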

Renting computers from Amazon is easy, but not if they’re Mac OS computers, which aren’t available through Amazon and get rather expensive through other dealers. Fortunately, the Columbia School of Journalism let us take over one of its Mac computer labs, so we were able to run the script in parallel and finish in a much more reasonable time frame. It also meant my laptop wasn’t out of commission crunching words for a week. Here’s a Vine of what the automated corrections looked like:

One drawback of this method was that we could only get the mouse automation to select the first suggestion. So if, for the misspelled word “abortiom,” “aborted” was suggested as more likely than “abortion,” this program would mark that as an inaccurate correction. We weren’t too worried about this, though, because 1) our iOS script in stage one *did* take into account multiple suggestions, so all the words had two chances to be corrected in that scenario, and 2) we presented 14 different misspellings of these words, and if any one of those variations was corrected we counted the word as accurately corrected. If a word that is only off by one character isn’t suggested after that many tries, then something in the algorithm isn’t handling that word correctly.

Step 3: Analyze results

This second stage only cut out around 6,000 words, leaving us with 14,000 words that were never accurately corrected. The ++related article++[] lays out our findings, but our initial hypothesis held true: “abortion” is a word that iOS doesn’t correct, unlike on Android phones. Apple declined to comment for this project, so we have many unanswered questions. One avenue for future research is whether iOS devices are incapable of learning certain words like “abortion.” That is to say, whether these words are blocked not just at the dictionary-suggestion level, but at the machine-learning level as well.

Stage Zero: Find the files

Before we did stage one we had a different strategy: find this list of seemingly banned words somewhere in the iOS file structure. To do this, we put out a call on Facebook for any friends who would donate an old iPhone to be jailbroken. We got three phones: one from my mom, and two from some very nice old friends who mailed them to our offices. We factory-reset and jailbroke one and kept the others factory-fresh for testing. We went searching and found some promising files in the LinguisticData directory called “pos,” “ner” and “lemmas,” which, in the natural-language-processing world, stand for “part of speech,” “named entity recognition” and “lemmatization,” the analysis of word stems and inflected forms, like “better” being associated with “good” as its base. These files were unreadable, however, because they weren’t in any known format. The only way we could read them was in their raw binary-hex form, which looks like that terrible mess of characters you see when you open a corrupted Word document: like Wingdings but with less rhyme or reason.

After many attempts at deciphering where a list of blocked words could reside, and after reaching out to the New York iOS community, we set about reverse-engineering the list ourselves, starting in earnest with stage one.

The other week, we published our latest interactive: ‘Male Plumage’ Then and Now: The Changing Face of Men’s Fashion. Now we don’t consider ourselves fashionistas, but one morning Michael got an email to upload a Newsweek article to Document Cloud. It happened to be an article from 1968 about men’s plumage, apparently around the time men first started to define a distinct American modern style via patterns and accessories. We looked at the article and started laughing at all the nostalgic memorabilia and vintage photographs.
The Newsweek cover containing the article depicts a Ron Burgundy-like character in a pink suit with some of the fashion items around him, depicted as cut-outs for a paper doll. In 1968 this depiction of changing outfits was for illustrative purposes, but in 2013 how could we re-imagine this cover? Could we see the cover come to life by having the pieces of clothing actually be changeable by the reader? It was fun to think about adding another dimension to print, and to look at something from the past and apply it to something current.
At the same time, Isabel Wilkinson, one of Newsweek’s fashion writers, was doing a story on Men’s Fashion Week 2013 in Europe. We thought it would be a great idea to compare men’s plumage in 1968 to the plumage we see today as part of the fashion show. And so we had a project.
Under the hood
To begin, we imagined the cover pieces as draggable items, with the 2013 plumage items stored in a drawer to the side. It was a fairly simple layout, allowing the cover to stand on its own and then be transformed by the reader into the moveable pieces. Instead of hardcoding the individual elements, we used The Miso Project from GitHub to create JSON from a Google Spreadsheet via an API key, which gave us the flexibility to add or remove pieces as we designed. We used Underscore.js as our templating engine to build our HTML elements. More about Miso is explained in this ++previous post++, but for this project, under a tight news schedule, filling out a spreadsheet was the easiest way to get the JSON we needed for our template. In our JSON, each object has an id, an image to load into the template, a layer (z-index) value, and a classification of either “then” (1968) or “now” (2013):
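Each row of the resulting JSON looked roughly like this (field names are illustrative, not our exact column headers):

```javascript
// Illustrative sketch of the spreadsheet rows once Miso turned them
// into JSON: one object per clothing item.
var rows = [
  { id: 'pink-suit',  image: 'pink-suit.png',  layer: 3, era: 'then' },
  { id: 'ascot',      image: 'ascot.png',      layer: 5, era: 'then' },
  { id: 'skinny-tie', image: 'skinny-tie.png', layer: 4, era: 'now'  }
];
```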



In our HTML, we added a script tag to identify our template and classes containing values from our JSON, something like:

In our JavaScript, we grabbed the template markup by its id and turned it into an Underscore template function with _.template(). We then appended each row from the spreadsheet (now JSON) to either a div containing the items of “then” or a div containing the items of “now”: ‘#plm-canvas-then’ or ‘#plm-canvas-now’. The syntax looks something like this (slightly different from our deadline code, but slightly nicer as well):
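A minimal stand-in for that step, with a hand-rolled interpolator in place of _.template() so it runs anywhere (selectors and field names are assumptions):

```javascript
// Mimic the Underscore <%= ... %> interpolation with a tiny template
// function; the real code used _.template() and jQuery's append.
function template(str) {
  return function (row) {
    return str.replace(/<%=\s*(\w+)\s*%>/g, function (match, key) {
      return row[key];
    });
  };
}

var itemTemplate = template(
  '<div id="<%= id %>" class="plm-item" style="z-index: <%= layer %>;">' +
  '<img src="<%= image %>"></div>'
);

var row = { id: 'ascot', image: 'ascot.png', layer: 5, era: 'then' };
var html = itemTemplate(row);
// In the real code, html would then be appended with jQuery to
// '#plm-canvas-then' or '#plm-canvas-now', depending on row.era.
```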

To make the elements draggable, we called jQuery UI’s .draggable() on all of the item elements, since they all had the same class.

Drag events don’t translate too well on mobile, however. To fix that, we loaded the jQuery UI Touch plug-in, which wired up the appropriate touch handlers for mobile screens.
Now we have all of our elements draggable and organized based on whether they are on the cover or in the “drawer.” The drawer element is something Michael came up with to let the reader pull and drag items into one place. Further, once readers make a plumage combination they like, they can share their design via social media. I’ll pass it to Michael now to explain the development of both of these features.
The Drawer and Shareable Links
Thanks Clarisa. This was a fun interactive to work on, and it had a few tricky details; this drawer was one of them. As Clarisa said, we wanted to mash up the old style with the new. We also wanted an option to remove items from the canvas once you were done with them, so you could clean up your design. In addition to .draggable(), jQuery UI has .droppable(), which lets you drop draggable items.
This was a little tricky because we wanted elements to be absolutely positioned outside the drawer but relatively positioned inside the drawer so they would stack up on top of each other. We handled this through swapping classes on .mousedown() and applying different positioning for each class.
To make it all shareable, we used jQuery BBQ, which is super handy. On item drag, we recorded the final x-y position of each element and whether it was in the drawer or the canvas. We used jQuery.bbq.pushState() with a merge mode of 0 to accomplish this. On load, we checked whether someone had a saved state in the hash and, if so, drew the plumage to reflect it.
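The idea can be sketched like this (the real code used jQuery BBQ’s pushState()/getState(); this hand-rolls a similar hash format, and the key names are made up):

```javascript
// Serialize each item's position and drawer flag into a URL-hash-style
// string, and parse it back on load.
function encodeState(items) {
  return Object.keys(items).map(function (id) {
    var it = items[id];
    return id + '=' + it.x + ',' + it.y + ',' + (it.inDrawer ? 1 : 0);
  }).join('&');
}

function decodeState(hash) {
  var items = {};
  hash.split('&').forEach(function (pair) {
    var kv = pair.split('=');
    var parts = kv[1].split(',');
    items[kv[0]] = {
      x: +parts[0],
      y: +parts[1],
      inDrawer: parts[2] === '1'
    };
  });
  return items;
}

var state = { ascot: { x: 120, y: 340, inDrawer: false } };
// encodeState(state) -> "ascot=120,340,0"
```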
Cues to interactivity

One small detail is the rotation that happens when you start the interactive. We needed to rotate the items because 1) they wouldn’t fit on the model in their original positioning; and 2) we wanted to cue the reader that the hitherto static image was now alive. This was also the logic behind spinning the model’s pink suit: we needed a way to tell the reader that the image was now in the 21st century.
 
Clarisa Diaz and Michael Keller
 


Today we published our first “Daily Beast Feature.” It’s called “Death by Indifference” and, through text and videos, it tells the story of history’s fastest-spreading HIV/AIDS epidemic, taking place in Russia.
The project came through former Senior Producer Gregory Gilderman, who went to Russia last year to report on the epidemic with support from the Pulitzer Center on Crisis Reporting. The black-and-white photos are from photographer Misha Friedman, who visited clinics throughout Russia that treat people with tuberculosis and HIV/AIDS. We wanted to place the focus of the page on the video stills and photography, since they highlight the people at the heart of the story. Our design decisions focused on making those images as evocative as possible. The black background lets the woman’s expression stand out from the page. The videos brighten as you scroll to them, drawing attention to the images and text in that section. That was easy to do using the “relative mode” of the Skrollr.js library.
All that said, we think the best part of the design, besides the images, came from Bronson Stamp’s choice of the beige color for the active nav-bar item. Something about fading from grey to beige really makes those section headers lift off the page. Use it with abundance: #f3f0df.
Below, our first pen and paper mock-up:

Michael Keller & Sam Schlinkert


Yesterday, we published an illustrated guide to force-feeding at Guantanamo Bay. Here’s how we made it.
The hunger strike at Guantanamo Bay has been consistently in the news for months, but with little action on the part of the Obama administration, there haven’t been many new developments to report. Abby Haglage, one of our staff reporters, wanted to brainstorm ways we could tell this ongoing story in a fresh way.
Clarisa Diaz, our NewsBeast Labs Summer Fellow and a current graduate student in Design and Technology at Parsons, Abby, and I started reading the recent coverage. We all came to the conclusion that force-feeding had been mentioned a great deal (organizations such as the UN and the American Medical Association have denounced it in this circumstance) but none of us had a clear idea from reading these stories of what the process was really like. So how do we show that?
After some discussion, a step-by-step animated walkthrough showing the force-feeding process detainees undergo, based on the Gitmo standard operating procedures, seemed like the most effective way to tell that story. To finish off the meeting, Clarisa suggested that instead of a walkthrough driven by buttons, we time it to the reader’s scroll, which is something my deskmate Sam Schlinkert would refer to as “next level.”
Under the hood
We here have been on the #lovetoscroll bandwagon since 2011, but this was the first time we actually built a scroll-based interactive. After researching some libraries, we liked Skrollr the best. Generally it works by assigning four pieces of information to each <div>:
1) The starting style;
2) The starting pixel depth for that style;
3) The ending style;
4) The end pixel depth for that style.
When you’re in between the start and end pixel points, the library animates between the two styles you’ve indicated. So on your sample <div>, if you set a starting CSS style of “opacity: 0;” at a pixel depth of 0 and an ending CSS style of “opacity: 1;” at a pixel depth of 500, then over the first 500 pixels of the page your <div> will slowly fade in.
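Conceptually, Skrollr is doing a linear interpolation between your two keyframes; a minimal sketch of the idea (not Skrollr’s actual internals):

```javascript
// Interpolate a style value between two keyframes: a start value at a
// start pixel depth and an end value at an end pixel depth.
function interpolate(scrollY, startPx, startVal, endPx, endVal) {
  if (scrollY <= startPx) return startVal;
  if (scrollY >= endPx) return endVal;
  var t = (scrollY - startPx) / (endPx - startPx);
  return startVal + t * (endVal - startVal);
}

// At a scroll depth of 250px, the example div is half faded in:
// interpolate(250, 0, 0, 500, 1) -> 0.5
```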
It has some other nice features, like the ability to set constants: if you’re unsure of how big your intro text will be, you can add a constant to change later (essentially a variable).
Our problem, though, was that we were unsure of a lot more than just an intro height. We didn’t know how many steps we would end up having, nor whether each step would require the same amount of scrolling, or if some steps would need 50 pixels and longer ones 500. We didn’t know the total distance, and therefore the speed, that we wanted the scroll to take up: “Will this take 100 pixels of scrolling to get through, or 20,000?” And lastly, we needed the ability to completely redo any of these decisions at the last minute, since last-minute changes are a fact of life in a newsroom.
In short, we didn’t want to hard-code any step pixel values. Ideally, we’d define a total scroll distance in pixels and the percentage of that distance each step would take up. We did this by making an object that holds our steps and the percentage of the total length each will occupy. Here’s a gist of the function that computes, for each step, its starting pixel value, ending pixel value, length and the cumulative percentage of the viz scrolled through so far. This is what the step-value object looks like after it’s run.
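The gist isn’t reproduced here, but the computation can be sketched like this (step names, percentages and the total distance are made up):

```javascript
// Total scrollable distance in pixels (an assumed value) and each step's
// share of it, in percent.
var TOTAL_DISTANCE = 10000;
var STEP_PERCENTS = { step_0: 10, step_1: 25, step_2: 40, step_3: 25 };

function computeStepValues(percents, total) {
  var values = {};
  var cursor = 0;     // running pixel position
  var cumulative = 0; // running percentage scrolled through so far
  Object.keys(percents).forEach(function (name) {
    var length = total * (percents[name] / 100);
    cumulative += percents[name];
    values[name] = {
      start_pixel: Math.round(cursor),
      end_pixel: Math.round(cursor + length),
      length: Math.round(length),
      cumulative_percent: cumulative
    };
    cursor += length;
  });
  return values;
}

var STEP_VALUES = computeStepValues(STEP_PERCENTS, TOTAL_DISTANCE);
// STEP_VALUES.step_1 ->
//   { start_pixel: 1000, end_pixel: 3500, length: 2500, cumulative_percent: 35 }
```

Because nothing is hard-coded, changing a step’s share (or the total distance) at the last minute just means editing one number and recomputing.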

We decided to add all of our SVG layers (more on those in a second) dynamically so we could, in turn, calculate the waypoints for Skrollr based on these step values. Simplified, an element that would fade in over the first step would look something like this, where ‘_deck’ is the previously discussed constant for the intro text:
var $element_to_fade_in = $('<div></div>')
  .html(svg_of_element_to_fade_in)
  .attr('data-_deck-' + STEP_VALUES.step_0.start_pixel, 'opacity: 0;')
  .attr('data-_deck-' + STEP_VALUES.step_0.end_pixel, 'opacity: 1;');
To animate the tube and the Ensure pumping through it, it was simply a question of animating the ‘stroke-dashoffset’ style on an SVG path, which draws only part of the path. To animate it in reverse, make the ‘stroke-dashoffset’ negative.
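A sketch of the math behind the trick (assuming stroke-dasharray is set to the path’s total length, which is the usual setup for this technique):

```javascript
// With stroke-dasharray equal to the path's total length, the
// stroke-dashoffset controls how much of the path is drawn: the full
// length means nothing is visible, zero means the whole path is drawn.
// Tying the offset to scroll progress "draws" the tube as you scroll.
function dashOffsetFor(pathLength, progress) {
  // progress runs from 0 (nothing drawn) to 1 (fully drawn)
  return pathLength * (1 - progress);
}

// Halfway through the step, half the tube is visible:
// dashOffsetFor(1200, 0.5) -> 600
```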
We had a couple of other pieces that listened for which step we were in, so that we could add navigation buttons and the like. But once this framework was in place, we could fine-tune the animation to go with the text.
But first things first. We started with the illustrations, which were handled by Clarisa, so I’ll hand it over to her!
Why are (some of) these SVGs different from all other SVGs?
Thanks Michael, it’s great to be here. Visualizing this procedure as it happens at Guantanamo was something we hadn’t seen. And although the facts had all been previously published, the visual approach communicated force-feeding in a more visceral and memorable way.
We drew each illustration layer in Adobe Illustrator, basing them on multiple medical illustrations to create a backdrop that highlighted each part of the body we were interested in. It’s also just more fun to make your own graphics. After reading through the operating procedure and doctors’ descriptions of the process, we knew the areas that force-feeding affects (nasal cavities, nerves, throat, chest) and the specific steps described in the Guantanamo SOP: attaching tape to the nose; administering the drug Reglan, which aids digestion but is also known to cause Parkinson’s-like side effects; and pumping Ensure into the stomach. We drew these layers (or added a photo, in the case of the Reglan) and set to work.
Why SVGs over images? We wanted to animate parts of the illustrations without creating masks. We could have made the animated layers SVGs and the static layers PNGs, but we also wanted one system to keep the layers consistently scaled from small monitors to large ones; when we mixed image formats, the images didn’t scale the same way. By saving them all as SVGs with the same canvas dimensions, we were sure they would all scale consistently. Also, with SVGs we could dynamically change properties of some layers, such as the color of the nerves and the thickness of the tube. We loaded the SVGs through <script> tags with unique IDs so we could dynamically control layer order and apply the scrolling attributes detailed above.
But we added one trick. SVGs can be large files (over 1 MB). Our chest and throat layers, for instance, were made up of many small dots to create a gradient effect, and all these SVGs were adding to load time.

To dramatically decrease the file size, we saved those layers as PNGs and embedded each PNG in our SVG canvas. This way we maintained the consistent scaling of our SVG system while keeping file sizes almost as small as plain PNGs. While it might seem counterintuitive or roundabout, it is very effective, and applicable to anyone illustrating for the web who is trying to get around long load times.
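The resulting file is just a normal SVG whose heavy layer is an embedded raster image, so it scales with the same viewBox as the vector layers (a sketch; the dimensions and file names are made up):

```xml
<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     viewBox="0 0 600 800">
  <!-- the heavy dot-gradient layer, flattened to a PNG -->
  <image xlink:href="chest-layer.png" x="0" y="0" width="600" height="800"/>
  <!-- animated layers stay as real vector paths -->
  <path d="M300,80 L300,420" stroke="#c00" fill="none"/>
</svg>
```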
Once we had our SVGs in their corresponding divs, we could manipulate them as elements in JavaScript. While I was drawing, Michael was organizing the content into the steps, and we could plug one system into the other. Back to you, Michael!
Designing for social
Thanks Clarisa! 
One of the design decisions we were thinking through was whether to only show the text for the current step or to cascade the steps as you scroll down. We went with the second option for two reasons.
1) Stacking the steps one after another emphasizes the downward motion of the scroll and the tube. It’s true that people are more used to scrolling as an interface, but it’s also true that your design should encourage the interaction that you want. If we had a bunch of horizontal lines as part of the design, the cue to scroll down would be far less natural.
2) We asked Sam, our Deputy Social Media Editor, what he thought, and his answer was simple: “If you make them all visible when you get to the end, we can screenshot and use the image for Facebook and Tumblr.” It was nice that we could cover different platforms at once, and it was a good example of thinking about multiple platforms as you design.
Why is this scroller different from all other scrollers?
Our scrolling behaved differently in one important way from normal parallax scroll sites: normal sites are one long page with different events flying in and out of view as you scroll (see the Skrollr demo for an example). Our interactive, however, sized the canvas to the full viewport height, fixed it to the viewport’s top, scrolled it over a distance of arbitrary length and faded layers in and out. The interactive behaves like a sticky nav, going from relative to fixed position after a certain depth. We also wanted to contain the interactive to a bounding box so that you could still see the comments when you got to the bottom of the page. We made a jQuery plugin, floating-scroll.js, to help with that. If your scrolling site is laid out the normal way, just keep in mind that not everything here will translate directly to your implementation.
Going SVG-less
Not everyone supports SVG. Internet Explorer doesn’t like it, and even iPhones and iPads have different scroll engines that didn’t quite get along with our CMS. For those browsers and devices, we served a flat image instead.
Prototyping
Here’s a video demonstrating how our scroll plug-in works. The cutout’s the viewport, the red clipboard’s the page, the book is the bounding box and the Newsweek Post-it pad is the content. Below, our initial storyboards. Not everything starts out pretty.

Yesterday, we published an illustrated guide on force-feeding at Guantanamo Bay. Here’s how we made it.

The hunger strike at Guantanamo Bay has been consistently in the news for months but with little action on the part of the Obama administration, there havn’t been many new developments to report. Abby Haglage, one of our staff reporters, wanted to brainstorm ways we could tell this ongoing story in a fresh way.

Clarisa Diaz, our NewsBeast Labs Summer Fellow and current graduate student in Design and Technology at Parsons, Abby and I started reading the recent coverage. We all came to the conclusion that force-feeding had been mentioned a great deal — organizations such as the UN and the American Medical Association have denounced it in this circumstance — but none of us had a clear idea as to what this process was really like from reading these stories. So how do we show that?

After some discussion, a step-by-step animated walkthrough showing the force-feeding process detainees undergo, based on the Gitmo standard operating procedures, seemed like the most effective way to tell that story. To finish off the meeting, Clarisa suggested instead of merely having buttons walkthrough, we time it to the reader’s scroll — which is something my deskmate Sam Schlinkert would refer to as “next level.”

Under the hood

We here have been on the #lovetoscroll bandwagon since 2011 but this was the first time actually building a scroll-based interactive. After researching some libraries, we liked Skrollr the best. Generally it works by assigning four pieces of information to each <div>: 

1) The starting style;

2) The starting pixel depth for that style;

3) The ending style;

4) The end pixel depth for that style.

When you’re in between the start and the end pixel points, the library animates between the two styles you’ve indicated. So on your sample <div>, if you set a starting CSS style of “opacity: 0;” at a pixel depth of 0 and and ending CSS style of “opacity: 1;” at a pixel depth of 500, then for the first 500 pixels of the page your <div> will slowly fade in. 

It has some other nice features like the ability to set constants so that if you’re unsure of how big your intro text is going to be, you can add a constant that you can change later (essentially a variable).

Our problem, though, was we were unsure of a lot more besides just an intro height. For instance, we didn’t know how many steps we would end up having, nor did we know if each step would require the same amount of pixels to scroll through or if some steps would need 50 pixels of scrolling and longer ones 500 pixels. We didn’t know the total distance, and therefore the speed, that we wanted the scroll to take up, “Will this take 100 pixels of scrolling to get through or 20,000?” And lastly, we needed the ability to completely redo any of these decisions at the last minute, since that is a possibility that can likely happen in a newsroom.

In short, we didn’t want to hard code any step pixel values. Ideally, we’d define a total scroll distance in pixels, and percentages along that distance that each step would take up. We did this by making an object that has our steps and the percentage of the total length that they will take up. Here’s a gist of the function that computes, for each step, its starting pixel value, ending pixel value, length and cumulative percentage of the viz that has been scrolled through so far. This is what the step value object looks like after its run.

image

We decided to add all of our SVG layers (more on those in a second) dynamically so we could, in turn, calculate the waypoints for Skrollr based on these step values. Simplified, an element that would fade in over the first step would something like this, where ‘_deck’ is the previously discussed constant for the intro text:

var $element_to_fade_in = $(‘<div></div>’)

.html(svg_of_element_to_fade_in)

.attr(‘data-_deck-‘+STEP_VALUES.step_0.start_pixel, ‘opacity: 0;’)

.attr(‘data-_deck-‘+STEP_VALUES.step_0.end_pixel, ‘opacity: 1;’);

To animate the tube and the Ensure pumping through the tube, it was simply a question of animating the ‘stroke-dashoffset’ style on an SVG path, which will only draw part of a path. To animate it in reverse, make the ‘stroke-dashoffset’ negative.

We had a couple of other things that would listen for what step we were in so that we could add navigation buttons and things. But once this framework was in place, we could fine-tune the animation to go with the text.

But first things first. We started with the illustrations, which were handled by Clarisa, so I’ll hand it over to her!

Why are (some of) these SVGs different from all other SVGs?

Thanks Michael, it’s great to be here. Visualizing this procedure as it happens in Guantanamo was something we hadn’t seen. And although the facts had all been previously published, the visual approach communicated force-feeding in a more visceral and memorable way.

We drew each illustration layer in Adobe Illustrator, basing them on multiple medical illustrations to build a customized backdrop that highlighted each part of the body we were interested in. It’s also just more fun to make your own graphics. After reading through the operating procedure and doctors’ descriptions of the process, we knew the areas that force-feeding affects (the nasal cavity’s nerves, the throat, the chest) and the specific steps described in the Guantanamo SOP: attaching tape to the nose; administering the drug Reglan, which aids digestion but is also known to cause Parkinson’s-like side effects; and pumping Ensure into the stomach. We drew these layers — or added a photo in the case of the Reglan — and set to work.

Why SVGs over images? We wanted to animate parts of the illustrations without creating masks. We could have made the animated layers SVGs and the static layers PNGs, but we also wanted one system to keep the layers consistently scaled from small monitors to large ones. When we mixed image formats, the images didn’t scale in the same way. By saving them all as SVGs in the same canvas dimensions, we were sure they would all scale consistently. Also, with SVGs we could dynamically change properties of some layers such as the color of the nerves and the thickness of the tube. We loaded the SVGs through <script> tags with unique IDs so we could dynamically control layer order and apply the scrolling attributes detailed above.

But we added one trick. SVGs can be large files (over 1 MB). Our chest and throat layers, for instance, were made up of a bunch of small dots to create the gradient effect, and all those dots were driving up load times.

image

In order to dramatically decrease the file size, we saved those layers as PNGs and embedded each PNG in our SVG canvas. This way we maintained the consistent scaling of our SVG system while keeping file sizes almost as small as plain PNGs. While it might seem counter-intuitive or roundabout, it is very effective, and useful for anyone illustrating on the web who needs to get around long load times.
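As a rough sketch of the wrapper, here’s how you might generate the SVG markup that embeds a PNG (the filename and dimensions are made up for illustration; this is the idea, not our production code):

```javascript
// Sketch: wrap a raster PNG in an SVG canvas so it scales in lockstep
// with the other SVG layers. Filename and dimensions are illustrative.
function pngInSvg(pngUrl, width, height) {
  return '<svg xmlns="http://www.w3.org/2000/svg" ' +
         'xmlns:xlink="http://www.w3.org/1999/xlink" ' +
         'viewBox="0 0 ' + width + ' ' + height + '">' +
         '<image xlink:href="' + pngUrl + '" ' +
         'width="' + width + '" height="' + height + '"/>' +
         '</svg>';
}

// e.g. the dotted chest gradient saved as a PNG, re-wrapped as SVG
var chest_layer = pngInSvg('chest-gradient.png', 1024, 768);
```

Because the viewBox matches the shared canvas dimensions, the embedded raster resizes exactly like the vector layers around it.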

Once we had our SVGs in their corresponding divs, we could manipulate them as elements in JavaScript. While I was drawing, Michael was organizing the content into the steps, and we could plug one system into the other. Back to you, Michael!

Designing for social

Thanks Clarisa! 

One of the design decisions we were thinking through was whether to only show the text for the current step or to cascade the steps as you scroll down. We went with the second option for two reasons.

1) Stacking the steps one after another emphasizes the downward motion of the scroll and the tube. It’s true that people are more used to scrolling as an interface, but it’s also true that your design should encourage the interaction that you want. If we had a bunch of horizontal lines as a part of the design, the response to scroll down would be far less natural. 

2) We asked Sam, our Deputy Social Media Editor, what he thought, and his answer was simple: “If you make them all visible when you get to the end, we can screenshot and use the image for Facebook and Tumblr.” It was nice that we could cover different platforms at once — a good example of thinking about multiple platforms as you design.

Why is this scroller different from all other scrollers?

Our scrolling behaved differently in one important way from normal parallax scroll sites — normal sites are one long page with different events flying in and out of view as you scroll (see the Skrollr demo for an example). Our interactive, however, sized the canvas to the full viewport height, fixed it to the viewport’s top, scrolled it over a distance of arbitrary length, and faded layers in and out. The interactive behaves like a sticky nav, going from relative to fixed position after a certain depth. We also wanted to contain the interactive to a bounding box so that you could still see the comments when you got to the bottom of the page. We made a jQuery plugin, floating-scroll.js, to help with that. If your scrolling site is laid out the normal way, just keep in mind that not everything here will translate directly to your implementation.
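The positioning logic behind a plugin like that can be sketched as a pure function. The mode names and thresholds here are illustrative, not floating-scroll.js’s actual API:

```javascript
// Sketch of sticky-nav style positioning for a scroll container:
// relative before the bounding box, fixed while scrolling inside it,
// and absolute (pinned to the box's bottom) once you're past it, so
// the comments below the interactive stay reachable.
function floatMode(scrollTop, boxTop, boxBottom, viewportHeight) {
  if (scrollTop < boxTop) return 'relative';                    // above the box
  if (scrollTop + viewportHeight > boxBottom) return 'absolute'; // past the box
  return 'fixed';                                                // inside the box
}
```

On each scroll event you’d apply the returned value as the container’s CSS position (plus a top/bottom offset in the ‘absolute’ case).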

Going SVG-less

Not every browser supports SVG. Internet Explorer doesn’t like it, and even iPhones and iPads have different scroll engines that didn’t quite get along with our CMS. For those browsers and devices we served a flat image instead.

Prototyping

Here’s a video demonstrating how our scroll plug-in works. The cutout is the viewport, the red clipboard is the page, the book is the bounding box and the Newsweek Post-it pad is the content. Below are our initial storyboards. Not everything starts out pretty.

image

This week, we’re super excited that our project This Is Your Rep On Guns was nominated for a Data Journalism Award in the Data-Driven Application category.

As we wrote about a couple of times on this blog, we have been tracking all 530+ representatives’ positions on gun control, updating it when news happens, and publishing their statements automatically via our Twitter robot @YourRepsOnGuns. 

It’s been a very fun project and one we hope that has been informative. We’ve received a number of emails from readers tipping us off to local articles and they’ve sent in letters from their rep explaining their position on gun control. 

If you want to help us be able to do more projects like this, there’s a People’s Choice component of the awards where you can vote for us.

If you like the project and want to give us your vote, you can find us here. We’re only 70 votes away from first place! (Sometimes the site asks you to also log in to Facebook, but nothing gets posted from your account, we’ve checked.)

Thanks for the help these past few months!

 

-Michael

Guest Post: Michael Keller from Newsbeast Labs

We released our interactive map templates that use Leaflet.js and CartoDB for all to use. Michael wrote a post about them on the CartoDB blog, reblogged below. Feel free to grab them from GitHub and use them:

cartodb:

As some of you may already know, Newsweek / The Daily Beast has been using CartoDB for some time now, and as such today’s blog post comes from Michael Keller of Newsbeast Labs. We’d also like to take the opportunity to thank Michael for his amazing contributions to the CartoDB community. Thanks!

A number of recent stories at the Daily Beast have had some kind of mapping component. We use them often to let people see how a national topic affects readers’ local areas.

image

I have been reusing code from former projects, so it was about time I standardized them into reusable templates with Leaflet.js. I released them on GitHub this week at http://mhkeller.github.com/cartodb-templates/.

I made three categories: basic map with hover states, hover states + hover infowindow, and all of that with templated infowindows using Underscore.js.

In each of these categories you’ll see a template for a point map, a polygon map, and a map with both points and polygons.

Some features:

• On point + polygon templates, the polygon hover state turns off when you hover over a point.

• Hover windows follow the mouse and respect the boundaries of the map-canvas. I find it most useful to have hover windows close to the mouse so your eye doesn’t have to leave a map region to see that region’s details.

image

• Templates with Underscore.js hover windows include sample formatHelper functions to act as a formatting layer between your data values and how you want them to display. For instance, you could store all your feature attributes as boolean variables and run them through various formatHelper functions to return nice display strings.

• The hover states work by storing a simplified GeoJSON representation of that feature as a feature attribute. On featureOver, that GeoJSON is plotted as a vector using Leaflet.js.

• Point + polygon templates add a secondary style class to hover windows when hovering over points to differentiate from polygons.
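A hedged sketch of that hover mechanic. The attribute name, callback shape, and styling are illustrative; the parse-and-plot idea is the point:

```javascript
// Sketch: each feature stores a simplified GeoJSON geometry as a
// text attribute; on hover we parse it and plot it as a Leaflet
// vector. Attribute and callback names here are illustrative.
function parseStoredGeometry(geomText) {
  return JSON.parse(geomText); // the simplified GeoJSON saved per feature
}

var hoverLayer = null;

function onFeatureOver(data, map) {
  if (typeof L === 'undefined') return; // Leaflet only exists in the browser
  if (hoverLayer) map.removeLayer(hoverLayer); // clear the previous highlight
  hoverLayer = L.geoJson(parseStoredGeometry(data.simplified_geom), {
    style: { color: '#ffcc00', weight: 2, fillOpacity: 0 }
  }).addTo(map);
}
```

Storing a pre-simplified geometry keeps the hover redraw cheap even when the source polygons are detailed.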

If you have any questions, I’m at @mhkeller. If you have improvements, pull requests at http://github.com/mhkeller/cartodb-templates

It was the Monday morning news meeting and all we could talk about was Dennis Rodman, Kim Jong-un, and Vice Media. It was the strangest story of the week, and utterly riveting. But there was more to it than what had already been reported.

The Vice show is funded by HBO, which is owned by Time Warner, which has a boatload of shareholders. Vice Media itself has a whole range of wealthy investors, including former Viacom CEO Tom Freston and The Raine Group, a who’s who of one-percenters. Meaning: a lot of people stand to benefit from the hospitality/propaganda machine of the one of the world’s most notorious dictators.

Reporter Caitlin Dickson started looking at the various connections, using LittleSis.org, an online database that tracks the social connections among the powerful—politicians, business leaders, lobbyists, hedge funders, etc. LittleSis is a project of the nonprofit Public Accountability Initiative and a great, easily-searchable, user-friendly source for reporters. You can search people and companies, find out how they’re connected to each other and to whom they donate money.

Check out the screenshot for an example, and the story for the final product. Caitlin was on deadline pressure so was only able to scrape the surface on the Vice story, but there’s no doubt much more to mine.

-Paula Szuchman, deputy managing editor

Last month we published a package of stories marking the fortieth anniversary of the Roe v. Wade decision. It had a few moving parts but I’ll just go over some of them briefly here.

How it started

This summer you probably heard the story about the last abortion clinic in Mississippi, which was threatened with closure under stricter state laws. Allison Yarrow, who sat across from me at the time, was covering the story, and it got us thinking: the line “The Last Abortion Clinic in Mississippi” is attention-grabbing, but it doesn’t tell the whole story. What you really want to know is how far people are from their nearest clinic, regardless of state boundaries. One state may have five clinics, but if they’re all in the southwest corner and you live in the northeast corner, and your adjoining states have clinics only at the borders farthest from you, you’ll have a hard time getting to a clinic even though your state has many. To see where this might be the case, and where access to services was compounded by new restrictive provisions (over 150 nationally in the past two years), we made as close to a comprehensive database as possible of every abortion clinic. Our goal was to see what parts of the country were farthest from a clinic. From start to finish, this process took about six months.

We got our address data from a variety of publicly available sources: Planned Parenthood, the National Abortion Federation, anti-abortion websites that keep their own lists, and others. We needed to verify that the address information was correct, though, so we called over 750 clinics to confirm. We also asked them up to how many weeks they offer services. The resulting database is the only one of its kind that we know of. The Guttmacher Institute undertook an abortion provider census in 2008, but it didn’t separate clinics from hospitals from private doctors’ offices, which represent different levels of care, a distinction we thought was important.

What it became

We started this in July and the project evolved. We thought the election might bring the issue of abortion access to the fore but it didn’t and that gave us more time. Allison brought up the fortieth anniversary of Roe v. Wade and that let us think much bigger about the project. Because this was such a personal subject matter, we knew readers’ comments would feature prominently (from both sides of the issue) and we wanted a strong narrative component, too.

To give a human voice to the Geography of Abortion Access map, Allison flew to Wichita, Kansas, one of the areas that stood out both on our map, as a metro area far from a clinic, and in recent memory as the site of the 2009 murder of late-term abortion provider George Tiller. To add a broader perspective, Sam Register, who runs the Newsweek Archivist tumblr, went through the Newsweek archives so people could follow the topic’s coverage from the ’70s through the ’00s.

What we learned from readers’ stories

Over the course of the week, we shifted the question we were asking from why you support or oppose legal abortion to a conversation about pro-life and pro-choice labels, as a way to get more nuanced opinions and show the complexity of the issue. We asked readers to complete either the phrase “I’m pro-life but…” or “I’m pro-choice but…” We got more responses from our other reader-based projects, but we were happy with how thoughtful and honest people were. Read our roundup of interesting responses to those questions, as well as our free-form “Tell us your story” prompt, here.

Under the hood on the map

How to represent this dataset was tricky. We had three main issues: anonymity, unbiased geography, and context. 

Anonymity: Although we got our data from publicly available websites that anyone could find and was often information that anti-abortion groups already held, we weren’t comfortable publishing addresses, names, or exact latitudes and longitudes. We took great care to do things like scrub our final database of anything identifiable and we partially randomized each clinic’s location so they weren’t pinpoint-able from our map. On the presentation level, we added the magenta circle big enough to span multiple hexagons (our base geography layer) to let people know that an address was approximate. Even if you backtrack and find our database, you won’t get any information that would let you de-anonymize the data.

Unbiased geography: As I wrote above, we wanted to get away from the arbitrary state and county borders that almost all of the research we encountered was based on. We did some initial plots using Census tracts, but that presents exactly the same problem [photo]. We ended up making a hexagonal grid using the Repeating Shapes plugin for ArcMap, which lets you make a grid out of your choice of shape and size. The trick to making a hexagonal grid for the web, so that the hexagons will be regular (all sides equal) no matter what degree of latitude they fall on, is to make the grid in your output projection, Web Mercator (EPSG:3857). You can reproject it to do your analysis in whatever projection you like, but because it will eventually be displayed in Web Mercator, it needs to be created in that projection so it doesn’t come out distorted in the browser. If you want a hexagonal grid 20,000 meters in diameter, here’s the one we used: Shapefile, KML, GeoJSON.
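For the geometry itself, the vertices of a regular hexagon in projected meters are simple trigonometry. A sketch (the function is ours, for illustration; center coordinates are in EPSG:3857 meters):

```javascript
// Sketch: vertices of one regular, flat-topped hexagon in projected
// (Web Mercator) meters. Building the grid directly in EPSG:3857 is
// what keeps hexagons looking regular on screen at any latitude.
function hexagonVertices(cx, cy, diameter) {
  var r = diameter / 2;
  var verts = [];
  for (var i = 0; i < 6; i++) {
    var angle = Math.PI / 3 * i; // 60-degree steps around the center
    verts.push([cx + r * Math.cos(angle), cy + r * Math.sin(angle)]);
  }
  return verts;
}

// e.g. a 20,000-meter hexagon centered at the projection origin
var hex = hexagonVertices(0, 0, 20000);
```

A grid generator then tiles these hexagons by offsetting centers row by row; tools like the Repeating Shapes plugin do that part for you.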

And here’s another one that Brian Abelson, current Knight-Mozilla Fellow at the New York Times, made while he was helping out on the project. They are also 20,000 meter hex grids. This one has the state borders preserved in case you want to assign state values to each hexagon: Shapefile, KML, GeoJSON.

Context: Generating our distance map wasn’t enough to tell a story with. We added three other pieces of information that would walk people through the significance of the patterns they were seeing. The first was a map of the female population aged 15-44, so that people could see the areas where women lived that were farthest away from clinics and identify significant metro areas (the pink dot density overlay). The second was the different legal restrictions that each area was subject to (areas with highlighted transparency). This was an interesting way to visualize the data because we didn’t highlight every hexagon in Kansas, for example, to show that certain laws were applicable in Kansas. Instead, we highlighted hexagons whose closest clinic was in Kansas. This gave us a very realistic map: people could see what state laws they would be subject to if their nearest clinic was across state lines, and it visually demonstrates how state laws can affect people who don’t live in that state. And third, we selected our own highlights from going through the data, such as the areas where telemedicine is banned in conjunction with mandatory in-person counseling. The combination of these laws in Arizona, for instance, means some women travel over a hundred miles and spend two days to get a prescription for the abortion pill.

More under the hood

The map itself we built using CartoDB, which allowed us to very flexibly add the different highlighted views of the map without rebaking our tiles each time. The slider that shows clinics offering services only up to X weeks works by loading four tile layers on top of each other at once and showing/hiding them depending on the slider value. This made the map slightly slower on initial load, but it made the transitions between map states super fast — so a trade-off.
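A sketch of that show/hide logic. The week cutoffs and the visibility flag are illustrative; in practice you’d call the tile layers’ own show/hide methods:

```javascript
// Sketch: four tile layers are preloaded; the week slider just
// toggles visibility, so switching map states never waits on the
// network. Cutoff values here are illustrative.
function visibleLayerIndex(sliderValue, weekCutoffs) {
  // pick the first layer whose cutoff covers the slider value
  for (var i = 0; i < weekCutoffs.length; i++) {
    if (sliderValue <= weekCutoffs[i]) return i;
  }
  return weekCutoffs.length - 1; // past the last cutoff: show the last layer
}

function updateLayers(layers, sliderValue, weekCutoffs) {
  var show = visibleLayerIndex(sliderValue, weekCutoffs);
  layers.forEach(function (layer, i) {
    layer.visible = (i === show); // stand-in for layer.show()/layer.hide()
  });
}
```

The trade-off mentioned above lives entirely in the preloading: all four layers cost bandwidth up front so that this function costs nothing later.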

The highlighted state views restyle and reload all four map layers as well. We used Leaflet.js’s ability to plot vectors to draw a line between the hexagon you’re hovering over and its closest clinic, providing more descriptive interaction.

The heatmap was created through ArcGIS from census tract data. We filtered for just the number of women of reproductive age, 15 to 44, per tract and then used the Create Random Points function in ArcGIS to create one point for every 210 women. We came up with the 210:1 ratio by looking at a histogram of the data to see what would be an accurate dividing point. Shameless plug: I used an online tool I made, www.Histagram.me, to generate quick, interactive histograms. Feel free to use it too.

Because the heatmap itself is done with CartoCSS layering techniques and not a statistically calculated heatmap, we made sure to compare side-by-side with a choropleth tracts map of the same data using Jenks-clustered color breaks to make sure that our heatmap told the same story as the choropleth. 

A few months ago we spoke with Andrew Hill, Senior Scientist at Vizzuality (which makes CartoDB), about some experimental ways to map the data. The line on hover came out of some of his renderings, and you can see some of the experimental line styles in the photos below.

All in all it was a lot of teamwork: Allison, Abby, Brian, Caitlin, Lizzie, Sam and a number of other people all helped with parts of it over the course of six months. If you have any other questions about it, let me know at michael.keller@newsweekdailybeast.com.

-Michael

Before we settled on the Value-by-alpha approach for showing the different state laws, some failures:

We tried outlining the different shapes and showing them in different colors:


We tried coloring the hexagon outline by the different laws that were in effect, but creating a sensible hierarchy proved difficult:

Lines instead of hexagons:

Highlighting Puerto Rico:

A value-by-alpha chart where census tracts are shaded by their percentage of women of reproductive age. Unfortunately, it's not that intelligible, and the heatmap overlay is a much cleaner way of showing this relationship:

Before we made the hexagon grid, here's how the map looks if you use census tracts:


After the Newtown shooting in December, we had a meeting over the phone to discuss our coverage. We decided on a two-speed approach: a quick reader-driven story about why readers do or don’t own guns (which we’ve written about a bit on this blog), and a deeper-dive look at the anticipated legislative issue that this and other recent shootings seemed to be bringing about, which we launched Monday as www.ThisIsYourRepOnGuns.com. The project idea grew out of the simple problem that not many people can name their representatives off the top of their head, let alone know their exact stance on gun control or how to get in touch to make their voice heard.

Eliza Shapiro, Abby Haglage and Caitlin Dickson did some awesome reporting for all 530+ representatives, digging through their voting records and previous public statements to distill their position to one of four categories: Opposes reform, Supports reform, Swing vote, or Unclear. We kept track of the sources, too, so that we could present representatives’ statements to the reader when the final thing was done. 

Brian Abelson was also around to rig together @RepsGunTweets (since renamed @YourRepsOnGuns), which served as both a tool to monitor reps’ statements to see what category they fell into, as well as an open feed for anyone interested in the topic to follow on Twitter. Read about how that was built in this blog post.

The interactive currently stacks up the number of reps in each category and lets you do a combination filter by different criteria such as chamber, party, and state. You can see things like how likely legislation is to pass each chamber and where different states stand. Importantly, too, you can put in your address and read information on your House representative and two Senators. Using information compiled by the Sunlight Foundation, it gives you their phone, fax (for those that prefer the fax), address, Twitter, website, and Facebook page so you can get in touch with them. We also pulled in each representative’s NRA grade and their rating from the Brady Campaign to Prevent Gun Violence to give more context to their legislative history.

My favorite part of it, though, is that we’ll be updating it as the gun debate goes on. We’ve already received emails from readers who have contacted their reps with statements that we’ll add, and one person sent us a local news story about their congressperson that will move him from Opposes reform to Swing vote. We’ll mark these updates on the landing page so people can follow along, and readers can leave their email to be notified of updates.

We also gave this its own URL, similar to how we did www.HavingTroubleVoting.com. For a resource and tool that we hope will have a long life, we felt an easy-to-remember, dedicated page showed our readers that this was something they could keep coming back to.

Under the hood

The hardest part of this was getting all of the data from multiple different sources into one nice database. We had a few different people researching, different numbers coming in from different places, and multiple editors editing. We used Google Spreadsheets and good spreadsheet etiquette to make sure people were marking the categories the same way and joined them in R. 

To make the stance information simple to update, the map copies that information from the main table on load instead of storing it separately with the map data.
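A minimal sketch of that copy-on-load pattern (the table id and data attributes below are hypothetical, not our actual markup):

```javascript
// Build a district -> stance lookup from rows scraped out of the main
// table, so the map never stores its own copy of the stance data and
// edits only ever happen in one place.
function stanceLookupFromRows(rows) {
  var lookup = {};
  rows.forEach(function (row) {
    lookup[row.district] = row.stance;
  });
  return lookup;
}

// In the page, the rows would come from the DOM with jQuery, e.g.:
// var rows = $('#rep-table tr[data-district]').map(function () {
//   return { district: $(this).data('district'), stance: $(this).data('stance') };
// }).get();
// var stances = stanceLookupFromRows(rows);
```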

The main page uses Isotope.js, which we’ve used a bunch before. But this was a little tricky because we needed to sort them into four columns. Fortunately, there’s some crazy extension for Isotope that lets you do just that. The harder part was figuring out how to get it to display top to bottom instead of bottom to top. But buried in the “Tests” documentation was a page on how to make your elements stack right-to-left for languages like Hebrew and Arabic. It includes the settings to rotate the positioning, which worked.
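For reference, later releases of Isotope expose that same rotation directly as layout options. A sketch only: the option names below come from Isotope v2's documentation, not necessarily the build or extension we used, and the selectors are hypothetical:

```javascript
// Hypothetical selectors; originLeft/originTop are Isotope v2 layout
// options that flip the layout origin (originLeft: false gives the
// right-to-left stacking used for Hebrew and Arabic).
$('#rep-container').isotope({
  itemSelector: '.rep',
  originLeft: false,
  originTop: false
});
```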

The only fancy mapping feature is that if you click on a district, the map automatically pans and zooms to fit the bounds of that district. This is done using the ST_Envelope() function in PostGIS through CartoDB. ST_Envelope() returns the bounding box of a given feature, which you can send to Leaflet.js’s fitBounds() method to pan and zoom to that box. The only problem to be aware of is that ST_Envelope() will give you an array of x and y values, but fitBounds() expects the format to be y then x (lat, then long). As long as you reorder the elements in your coordinate array, Leaflet will be happy.
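The reordering itself is a one-liner. Here's a sketch, with the CartoDB SQL call left in comments since the table and column names are hypothetical:

```javascript
// Convert a PostGIS envelope ring of [lng, lat] pairs into the
// [lat, lng] pairs that Leaflet's fitBounds() expects.
function envelopeToLatLngs(ring) {
  return ring.map(function (c) { return [c[1], c[0]]; });
}

// In the page, roughly (account and table names hypothetical):
// var sql = "SELECT ST_AsGeoJSON(ST_Envelope(the_geom)) AS bbox " +
//           "FROM districts WHERE district_id = '" + id + "'";
// $.getJSON('https://example.cartodb.com/api/v2/sql?q=' + encodeURIComponent(sql),
//   function (data) {
//     var ring = JSON.parse(data.rows[0].bbox).coordinates[0];
//     map.fitBounds(envelopeToLatLngs(ring));
//   });
```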

Getting the aesthetics of the map right was a little tricky. I wanted to make sure that a highlighted feature’s outline appears above the other features but below its own fill so you get a bright white border and then a subtler inner border. If you follow the symbol drawing order and compositing option rules in CartoCSS it becomes manageable.

From the failures folder

Here’s what the original mock-up looked like, which we weren’t too far off from. I reworked the top nav hierarchy into two main buttons, added more color and turned the rep detail elements into three columns instead of rows so it was more compact and graphic.

-Michael

Notes and images from an ever-growing digital newsroom.

Newsweek & The Daily Beast

Contributors:
Brian Ries & Sam Schlinkert

Formerly:
Michael Keller, Andrew Sprouse, Lynn Maharas, & Clarisa Diaz
