Typing Data: Preliminary Analysis

I have collected a large quantity of typing data using Amphetype, on both QWERTY and MTGAP 2.0 (the two layouts that I currently know). I do not have any conclusive results, but I have some interesting data that I thought worth sharing.

My most interesting discovery is that there is a statistically significant correlation between the frequency of a trigram and the average speed at which it is typed. On MTGAP 2.0 the correlation is 0.34 and on QWERTY it is 0.33. This means that a trigram’s frequency accounts for roughly 11% of the variation in typing speed (the square of the correlation). That is not a lot, but it is still enough to merit consideration.
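The share of variation explained comes from squaring the correlation coefficient. The following is a minimal sketch in Python, assuming the correlation is a Pearson correlation; the function names and the `frequencies` and `speeds` samples are invented for illustration and are not the Amphetype data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def variance_explained(r):
    """Fraction of variance explained by a linear fit with correlation r."""
    return r * r


# Hypothetical sample: trigram counts and average speeds (wpm).
frequencies = [120, 95, 60, 40, 22, 10]
speeds = [118, 109, 121, 101, 112, 99]
sample_r = pearson_r(frequencies, speeds)
print(f"sample: r = {sample_r:.2f}, r^2 = {variance_explained(sample_r):.3f}")

# The two correlations reported in the post.
for layout, reported_r in (("MTGAP 2.0", 0.34), ("QWERTY", 0.33)):
    print(f"{layout}: r^2 = {variance_explained(reported_r):.3f}")
```

Squaring 0.34 and 0.33 gives values between 0.10 and 0.12, which is where the figure for the share of variation comes from.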

Then I analyzed the speeds of various key combinations. For example, on MTGAP 2.0, the average speed for a trigram containing an inward roll is 121 words per minute (wpm); for a trigram containing an outward roll, the average is 110 wpm, and for a trigram containing neither, it is 111 wpm.

When all three keys are typed with one hand, the average is 104 wpm; when two are typed with one hand and one with the other, the average is 118 wpm; and where the hand alternates, 107 wpm.
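To make the three hand categories concrete, here is a rough sketch of how trigrams could be grouped and averaged. It assumes standard touch-typing hand assignments on QWERTY, and it assumes that "alternation" means the hand changes on every keystroke (left-right-left or right-left-right), with every other mixed pattern counted as "two and one". The names and the sample speeds are hypothetical and not taken from the actual analysis.

```python
# Assumed hand assignment for standard QWERTY touch typing.
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm;,./'")


def hand_of(key):
    """Return 'L' or 'R' for a key, under the assumed QWERTY hand map."""
    k = key.lower()
    if k in LEFT_HAND:
        return "L"
    if k in RIGHT_HAND:
        return "R"
    raise ValueError(f"key not in hand map: {key!r}")


def hand_pattern(trigram):
    """Classify a trigram as 'same hand', 'two and one', or 'alternation'."""
    if len(trigram) != 3:
        raise ValueError("expected exactly three keys")
    hands = "".join(hand_of(k) for k in trigram)
    if hands in ("LLL", "RRR"):
        return "same hand"
    if hands in ("LRL", "RLR"):
        return "alternation"
    return "two and one"


def average_wpm_by_pattern(samples):
    """Average the speeds of (trigram, wpm) samples within each category."""
    totals = {}
    for trigram, wpm in samples:
        pattern = hand_pattern(trigram)
        total, count = totals.get(pattern, (0.0, 0))
        totals[pattern] = (total + wpm, count + 1)
    return {p: total / count for p, (total, count) in totals.items()}


# Hypothetical measurements, for illustration only.
samples = [("the", 100), ("and", 120), ("was", 90), ("ing", 115)]
print(average_wpm_by_pattern(samples))
```

For another layout, only the two hand sets would need to change.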

Where the total finger travel distance is short, the average is 120 wpm; for medium distance, 111; and for a long distance, 105.

It would be premature to draw conclusions from these data. For example, short finger travel distance may be faster simply because MTGAP 2.0 intentionally places common keys on the home row, and common keys tend to be typed faster. On QWERTY, the average speeds for short, medium, and long distance are 96, 102, and 104 wpm, respectively. In this case, the short-distance trigrams are the slowest.

I am currently looking for anyone who uses Amphetype or is willing to contribute some time to using it. I want to get as much typing data as possible, especially on a variety of keyboard layouts. Leave a comment if you are interested.

For those who are interested, here are all the data I have acquired.


MTGAP 2.0

Average WPM: 112

near distance average: 120
medium distance average: 111
far distance average: 105

inward close keys average: 120
outward close keys average: 107
not close keys average: 110

in roll average: 121
out roll average: 110
not roll average: 111

same hand average: 104
two and one average: 118
alternation average: 107

triple finger average: 73
same finger average: 91
different finger average: 115

twice jump average: 73
home jump average: 92
home jump index average: 113
not jump average: 112

twice to center average: 105
to center average: 116
not to center average: 112

QWERTY

Average WPM: 104

near distance average: 96
medium distance average: 102
far distance average: 104

inward close keys average: 106
outward close keys average: 113
not close keys average: 102

in roll average: 104
out roll average: 116
not roll average: 102

same hand average: 98
two and one average: 106
alternation average: 105

triple finger average: 72
same finger average: 87
different finger average: 107

twice jump average: 71
home jump average: 82
home jump index average: 116
not jump average: 104

twice to center average: 100
to center average: 108
not to center average: 102

  1. Tim Johnson
    November 29, 2011 at 10:33 am

    It’s great to see you collecting empirical data for keyboard optimization. However, your fully optimized keyboard posted on Jan 16 has a couple problems which the particular data you’re collecting won’t reveal.

    1. It doesn’t seem to take into account Malt’s findings:
    “On the qwerty layout the highest source of error is on the vowels e and i. On the Linotype layout it is on the vowels a, i, and o and Dvorak himself detected that of 3,329 errors analysed for the ‘Simplified’ keyboard 1,631 involved the vowels. The five vowel keys accounted for 48.99% of the errors analysed. (Figures abstracted from chart on p 504). If vowels are strategically placed so that they do not appear on adjacent keys, nor on the same finger and same row of the two hands, neural confusion may be avoided.”
    http://www.ergo-comp.com/articles/keyboarddesign.html

    2. For your Kinesis layout, you put the space key under the right thumb, as usual, but you put nothing under the left thumb, “for aesthetic reasons”. Maltron gains a significant advantage by putting a letter (e) under the left thumb; if you use the same strategy, maybe your program can come up with an even better layout than the Maltron. (The factory default Kinesis layout uses that key for backspace, but backspace can be moved elsewhere, such as to the default position of delete, in order to make room for a letter.)

    With these changes, what layout do you get for a full Kinesis optimization?

    • November 29, 2011 at 3:06 pm

      1. I had not previously read those findings. I agree that it is important to reduce errors, but only because doing so increases typing speed. Typing speed, not accuracy, is the ultimate goal. A measurement of typing speed already takes errors into account, so if we have data on speed, we shouldn’t have a need for data on accuracy.

      Even so, according to my data, the vowels are actually more accurate than the other letters—on MTGAP 2.0, three of the five most accurate letters are vowels, and on QWERTY all five of the top five are vowels. I acknowledge that this result may be due to some personal bias, but it is interesting that it so significantly contradicts Malt’s findings.

      I have not yet modified my keyboard layout optimizer to account for the data published in this post, as I do not yet have enough information to properly interpret the data. I want to collect some statistics from a variety of keyboard layouts, which is why I am looking for people who use layouts besides QWERTY.

      2. My current design does not include thumb pads, but I intend to place space and enter on the right thumb, and shift and backspace on the left. That makes good use of each of the thumbs.

      If I were to include backspace in my keyboard optimizer, I would have to get data on when and how frequently it is typed. A normal text corpus does not include that kind of data.

  2. Tim Johnson
    November 29, 2011 at 9:14 pm

    “I agree that it is important to reduce errors, but only because doing so increases typing speed. Typing speed, not accuracy, is the ultimate goal.”
    You’re underestimating the advantages of being able to type with a very low error rate. Speed is only one advantage. For type-A personalities, it’s also less stressful than having to make frequent corrections. And for situations where a small number of errors is tolerable, it enables rapid typing without bothering to review the typed text at all. And if the error rate is small enough, it justifies moving the backspace key to a less-prominent position, to make room for more-frequently used keys such as “e” and shift, and maybe even tab, ctrl, alt, and win.

    “I acknowledge that this result may be due to some personal bias, but it is interesting that it so significantly contradicts Malt’s findings.”
    It not only contradicts Malt’s own findings (about qwerty and linotype), but also Dvorak’s findings (about his own layout), as Malt referenced.

    “I intend to place space and enter on the right thumb, and shift and backspace on the left. That makes good use of each of the thumbs.”
    Fortunately Kinesis provides more than two keys for each thumb. At least three for each thumb are fast and easy to press, so you could fit a letter, ctrl, and backspace all under the left thumb, and space, enter, and shift all under the right thumb.
    It would be useful to see the layout which your program produces with these changes, even if you don’t intend to use it, for the sake of seeing how closely it matches the layout which Malt produced by hand, and for the sake of us who want to use a computer-optimized layout which takes into account all of the published research findings. I spent about an hour wrestling with your program just trying to enable a letter under the left thumb, but I was unsuccessful; hopefully it will be easy for you.

    “If I were to include backspace in my keyboard optimizer, I would have to get data on when and how frequently it is typed. A normal text corpus does not include that kind of data.”
    Computer QWERTY Keyboard Key Frequency:
    Space e t Shift a o i n s r h Del l d c u Enter
    m f p g w y b , . v k ( ) _ ; ” = ‘ – Tab x / 0 $ *
    1 j : { } > q [ ] 2 z ! < ? 3 + 5 \ 4 # @ | 6 & 9 8 7 % ^ ~ `
    http://letterfrequency.org/

    That doesn't distinguish between delete and backspace, but here's some data which does:
    http://xahlee.org/emacs/command-frequency.html
    Backspace is 1.62% of the total, delete is 0.45%, and data entry is 46.81%. The letter "e" is 12.7% of all letters (http://en.wikipedia.org/wiki/Letter_frequency); multiplying that by 46.81% gives 5.94% of the total, which is slightly inaccurate because Xah's data includes letters, digits, and punctuation as data entry while the Wikipedia data includes only letters, but the result still shows that "e" is far more frequent than backspace.
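The arithmetic above can be checked with a short sketch. The constants are the percentages quoted in the comment; the function name is hypothetical, and the calculation keeps the same simplifying assumption that the letter share applies to all data entry.

```python
# Shares quoted in the comment above.
E_SHARE_OF_LETTERS = 0.127   # "e" as a fraction of all letters
DATA_ENTRY_SHARE = 0.4681    # data entry as a fraction of the total
BACKSPACE_SHARE = 0.0162     # backspace as a fraction of the total
DELETE_SHARE = 0.0045        # delete as a fraction of the total


def share_of_total(share_within_data_entry, data_entry_share):
    """Scale a share measured within data entry to a share of the total."""
    return share_within_data_entry * data_entry_share


e_share = share_of_total(E_SHARE_OF_LETTERS, DATA_ENTRY_SHARE)
print(f"estimated share of 'e': {e_share:.2%}")
print(f"'e' vs backspace: {e_share / BACKSPACE_SHARE:.1f}x")
print(f"'e' vs backspace + delete: {e_share / (BACKSPACE_SHARE + DELETE_SHARE):.1f}x")
```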

    • November 29, 2011 at 11:29 pm

      What is a situation where few errors are tolerable?

      My program currently doesn’t have thumb keys. I plan on adding those in, but the thumb pads contain many keys that are hard to measure such as ctrl and backspace.

      It is possible to estimate how frequently backspace is typed, but that doesn’t tell me how frequently it is typed before or after other keys. I have some typing data that I have collected, but the frequency with which I mistype certain characters depends on what layout I’m using so I have no way of getting accurate general statistics. There are many factors that determine how frequently a character sequence is mistyped, and those factors change significantly when you switch to a new layout.

      All else being equal, it is probably better to not place vowels close together. However, I do not know how much of a difference that makes, or whether it is worth it.

  3. November 30, 2011 at 1:17 am

    Hi Michael.
    Glad to see your KB development is still active. The following idea stems from my difficulty in compiling the layout generator.

    I would urge you to devise a web-client for Amphetype. This way, non-technical seekers of truth could do everything through your site.
    — register with a valid email address
    — read a page describing 12 high-scoring evolved layouts (don’t overwhelm them)
    — pick one and test-drive it in the Amphetype-client

    The less tech-savvy public would suddenly gain access to this wonderful corner of the universe. You would get more stats and a clearer idea of people’s aesthetics. Requiring users to login means that they will get a simple progress report while you will get full stats. If users decide to try another layout, they could do so within Amphetype-client. They would be limited to the shortlist of 12 layouts which keeps it simple and ensures a useful volume of stats for each.

    With an attractive UI and some marketing in academic circles, it could be big. I think 14-24 is the elastic-brained age bracket to target.

  4. Tim Johnson
    November 30, 2011 at 10:11 am

    “What is a situation where few errors are tolerable?”
    Informal transcriptions of verbal speech can tolerate some errors. If the errors are few, then they can be left in place without causing significant distraction when read later, but not if the errors are many. A transcriptionist working in real time can gain some speed (which might make the difference in being able to keep up with the speech) by ignoring all errors as he types, and not even reviewing the text as he types, but ignoring all errors is only a viable option if the error rate is low enough. A typist who can type at 120wpm with corrections, or 180wpm without corrections, but has a high error rate, is useless for real-time transcription, but a typist who can type at 150wpm without corrections but with a low error rate is useful.
    If subjected to automated spelling correction without human supervision, there’s risk of incorrect “corrections” to text unless the spell checker is highly intelligent (for example, an ordinary spell checker doesn’t know whether “niw” was supposed to be “now” or “new”), which is why if human supervision can’t be afforded, it’s typically better not to run a spell checker at all. The state of the art in speech recognition software also isn’t satisfactory.
    But I think the stress reduction for ordinary typing is the more important issue. The need for frequent error correction is annoying, and the need to review your text as you type is a distraction.

    “My program currently doesn’t have thumb keys. I plan on adding those in, but the thumb pads contain many keys that are hard to measure such as ctrl and backspace.”
    For now, just considering the home keys for the thumbs (backspace and space by default on Kinesis), almost certainly space is optimal for the right thumb, so you don’t have to bother to change your program to verify that fact, and if a letter is optimal for the left thumb, almost certainly that letter is “e”. I just now realized that this means you could manually assign “e” with this assumption, and not have to change your program at all, just replace every occurrence of the letter “e” in your text corpus by a space; then, running your program will generate the optimal layout with the constraint that “e” and space be under the thumbs.
    With regards to backspace, I think it’s obvious that both it and shift should be under the thumbs, just as it’s obvious that space and enter should be under the thumbs. You don’t need a computer program to tell you this. Since you have at least 6 easily-accessible keys total under the thumbs, that leaves room for two more functions, most likely the letter “e” along with either another letter, or tab, or ctrl. Assuming it isn’t another letter, the only question then is which thumb keys in particular are used for which of those functions, and a computer-generated assignment is unlikely to be a significant improvement on a straightforward manual assignment.

    “All else being equal, it is probably better to not place vowels close together. However, I do not know how much of a difference that makes, or whether it is worth it.”
    Well, generate a layout with the constraint that the vowels be unclustered, and see how it scores compared to your current clustered-vowel layout. If the difference isn’t significant, then there’s no reason not to uncluster the vowels, considering the research which advises unclustering them. If unclustering them results in significantly worse scores, then it might be worthwhile to keep them clustered.

  5. Patience
    December 1, 2011 at 2:15 am

    The problem with having space and shift on the same thumb is that you will have same finger for thumb in *every word that starts with a capital*. And having tried this already, I can say that same finger for thumb is more uncomfortable than any other same finger.

  6. Patience
    December 1, 2011 at 2:18 am

    Have any of you looked into Plover? Here’s the website: http://plover.stenoknight.com/

  7. mothersdevotion
    April 3, 2012 at 4:55 pm

    I am using the Dvorak layout (transitioning to it at present) combined with 2 foot pedals to activate a 2nd and 3rd layout on the fly when those layouts are optimal. There is exciting data to be found here if you would like. Amphetype might have issues tracking the appropriate keypresses with my setup:
    I have the foot pedals sending ScrollLock and NumLock, and a script with Autohotkey.com to remap my keys to my custom key layout (depending on which pedal is hit). Not sure if there is a way in Amphetype to differentiate here. Email me back if this would interest you.
