shanen0

Q: Improving voice dictation results?

Recently bought a MacBook Pro (Retina) and working (or trying to work) with the voice dictation. However the quality of the results actually seems to be getting worse over time, whereas it's supposed to get better. I suspect that I am giving it confusing feedback, but can anyone explain how it is working or how it is supposed to work?

 

My original theory was that sometimes I am editing silently but with the dictation active, and it was interpreting that as incorrect feedback about recognition errors that weren't there. To prevent this, I tried making sure that the voice input was disengaged before making any editorial changes, but that did not seem to help.

 

The feedback coming from the Mac's side is also somewhat confusing, but it definitely seems as though the Mac is refining its interpretation for several seconds. After I have said a few more words, it often goes back and changes part of the text that was already displayed, so I think it is building better recognition results based on the additional context. Sometimes it winds up with ambiguous text in blue and the option to select the proper text, but it definitely seems that it offers that option less frequently these days, even though the quality of the transcription in that passage may be poor. (There are times when it seems to be working better, but I also can't find any pattern or speaking style for consistently better results. The average transcription quality seems to be near the breakeven point... Sometimes it is definitely faster than typing, but other times the corrections take longer than just typing it in the first place.)

 

Related topic, but sometimes I want to go back and change part of what I dictated by inserting new words in the middle, still using the dictation. However, right now I think that is a bad idea, and I am trying to avoid doing it. The Mac ought to be smart enough to figure out what is going on, and even to recognize the nearby content as relevant, but I guess not.

 

As a constructive suggestion for recognizing the words properly, it would be quite helpful if I could tell the Mac to play the sound it thinks it heard around some word so that I could correct the displayed words in accord with what I actually said. Perhaps with an explicit switch to a no-dictation editing mode for final polishing?

MacBook Pro with Retina display

Posted on May 27, 2015 7:26 PM

  • by shanen0,

    shanen0 May 27, 2015 7:32 PM in response to shanen0
    Level 1 (14 points)
    Desktops

    By the way, I've also asked this question to Apple's telephone support without getting anything useful. Ditto a local Apple store, but I didn't know I was supposed to get an appointment over the Web before going...

     

    At one point the phone people referred me to the manuals section of the website, and I also spent some time there, but could not find any manual that seemed to cover this question. If you can point me at an appropriate manual, it would help. Or maybe I need a book or manual targeted at Linux/Windows people entering the Mac world?

     

    My main reaction to my first month of Mac ownership is that "Think Different" must be a joke. "Think OUR Way" seems more plausible at this point. Sorry, but I will persist in thinking my own way.

  • by shanen0,

    shanen0 May 27, 2015 10:34 PM in response to leroydouglas
    Level 1 (14 points)
    Desktops

    Been there and suspect I have already found most of the similar pages.

     

    Perhaps it will help to clarify that I am not concerned with commands or meta-commands because of the modality problem. I don't think any of the systems are smart enough to distinguish between modes reliably, and I don't think I could even keep it straight enough to speak precisely about such things. My objective is simply voice input during bulk composition.

     

    What I'm doing now when I dictate, but with only limited success, basically involves thinking for a few seconds and then saying a sentence at moderate speed and with fairly careful pronunciation of each word. Quite often I am ready and start dictating the next sentence before the analysis has been completed on the previous sentence, but the Mac can usually keep up with me pretty well.

     

    I have done some experiments with just talking in a relaxed and informal way, but the results of that approach so far appear to be mostly unusable. At my normal speed, I'm lucky if it catches enough of the important words to allow me to reconstruct what I was thinking about, but my natural speaking speed is unusually rapid. However, this is where my suggestion might be especially helpful, because I am pretty good at recognizing my recorded words even at my fastest speaking tempos.

  • by rccharles,

    rccharles May 28, 2015 9:39 AM in response to shanen0
    Level 6 (8,471 points)
    Classic Mac OS

    I wouldn't expect perfection in the area of voice recognition.  Computer Science researchers have been working in this area for fifty or more years.  It's not an easy area.  Everyone's voice is slightly different.  Words are context dependent.

     

    Example: wind

    http://dictionary.reference.com/browse/wind

     

    If dictation is your goal, you could try a commercial package. I suspect the authors of a commercial package would have spent more effort on it than Apple has. Would any dictation application be good enough for you? Who knows.

     

    What about background noises?  Changing background noises can be a distraction.  A quality microphone might help.

     

    Robert

  • by shanen0,

    shanen0 May 28, 2015 6:14 PM in response to rccharles
    Level 1 (14 points)
    Desktops

    Actually I supported a research lab for many years, so I'm pretty up-to-speed on the state of voice recognition and transcript editing. Ever heard of a voice font? Not sure it ever went commercial, but Mel Blanc had scores of them in his head... I have experimented with conditions and various background noises, and also with my speaking position, but not with other microphones.

     

    Feels like I'm repeating myself, but I'm not expecting perfection. My original focal concern is whether or not my efforts to train the system are going in the wrong direction. My broader concern is with the tradeoff against typing. Even when I am typing there is a certain loss of efficiency for mistakes, but the lossage of the voice dictation should be as low as possible to save as much input time as possible...

     

    From Apple's description, they seem to think they are offering accurate recognition of natural and continuous speech of the conversational style. I know how difficult that is, but I am not trying to push it that hard. Perhaps my problem is that I'm fighting the system and Apple has truly optimized the parameters in that direction?

  • by shanen0,

    shanen0 Jun 10, 2015 5:28 PM in response to shanen0
    Level 1 (14 points)
    Desktops

    Eh? What is this auto-saved version? Seems to be the same as the one that was saved, but I can't figure out which I should delete, or if trying to delete either would destroy both... Ergo, I should just save this version and see what happens to the other one?

  • by shanen0,

    shanen0 Jun 10, 2015 5:42 PM in response to shanen0
    Level 1 (14 points)
    Desktops

    Hmm... Okay, as I was about to report before the mysterious interruption of the auto-saved version, just a minor status report on this topic:

     

    The quality of the results from voice dictation now seems stable. The results no longer seem to be getting worse, but neither are they getting better. My searches for more information on how it works or how to train it more effectively have all come up dry, including calls to Apple support. My various experiments all failed to produce clear results, at least over the time periods in which I tested various techniques or rules for dictation. I have become a little better at using it, so I feel that there is usually a savings in time if I can dictate first, but the savings are marginal and I am not yet ready to assess the overall quality. Composing by speaking is more natural in some ways, but also affects the results, especially their structure and organization.

     

    In terms of the higher goal of learning to like the Mac, no progress. Neither do I dislike it. Perhaps I should envy people who still feel such enthusiasm for any computer, Mac or whatever? As a paying customer, I think I should have bought a Chromebook first, and I'm now pretty well resolved to buy one and see whether or not it is more satisfactory. (I still despise Microsoft, but I may take advantage of the supposedly free trial of Windows 10 to test if I should despise MS less than I currently think I do...)

     

    Narrow response to the Apple Support Communities is that my three attempts produced almost no useful information (excluding information that I had already found via other channels, which basically means searching the Web). I had high hopes for a pointer to some book like "How I Learned to Love my Mac"?

  • by Grant Bennet-Alder,

    Grant Bennet-Alder Jun 10, 2015 7:34 PM in response to shanen0
    Level 9 (60,714 points)
    Desktops

    The beauty of the Mac experience is that the interface is approachable for novice users, but can be sped up (by learning the shortcuts) to serve near-expert users as well. Don't know what you don't know? Just mouse around the menus and read the major choices. Or, as a near-expert, type the shortcuts for immediate results; mousing is not required.

     

    It is more approachable, more regular, and far more forgiving than other Operating Systems in use today.

     

    Compare and contrast unix, where if you make too many mistakes in typing the OS suggests you may need to take a nap. The tools there are extremely sharp, but the only thing novices seem to accomplish is getting themselves cut to ribbons.

     

    I did not say there is no learning curve -- of course there is one, and high-end software comes with an enormous learning curve. But more than most others, Mac OS X allows you to use the computer as a tool rather than an end in itself.

  • by shanen0,

    shanen0 Jun 14, 2015 5:31 PM in response to Grant Bennet-Alder
    Level 1 (14 points)
    Desktops

    Another heavy weekend, and only limited time for wrestling with the voice input. No significant progress to report, I'm afraid.

     

    However, on the general interface topic, I think the future interfaces will be more flexible and adaptable to what the individual user wants to accomplish. As the work calls for new features, they should become accessible in accord with the way that person works. The idea of the monolithic "best" interface is a large part of the problems we're fighting, though I'm not sure who to blame... On the one hand, I feel like Apple may have started that arms race, but on the other hand, I feel MS made it profitable... On the third hand, I want to give some credit to Apple for rethinking things, even if they apparently remain committed to the ridiculous search for best. (Unfortunately, even if I know roughly what form the solution will have, I don't expect to live long enough to see it.)

  • by pinkypinky,

    pinkypinky Jun 27, 2015 12:10 PM in response to shanen0
    Level 1 (0 points)

    I am currently using 10.10.3, on iMac. I had read that Yosemite could perform speech recognition offline, and Apple called it "enhanced" dictation...

    Well, the results are not perfect, but different from online dictation.

    The bad part of the business is that online dictation allows the text to be changed, because it underlines the ambiguous words in blue. The same does not happen with the "enhanced" version! So it is impossible to teach the software! It is also impossible to add the missing words!

    Is this version "enhanced"? Wow, fantastic. But, excuse me, let me better understand: enhanced regarding...what?

    Apple support was not able to tell me a solution. Really, they looked like they weren't aware of the problem. Can't believe that nobody told to Apple about this issue...

  • by shanen0,

    shanen0 Jun 27, 2015 5:40 PM in response to pinkypinky
    Level 1 (14 points)
    Desktops

    Well, I think I have to make some allowances because your English is pretty clearly not one of the mainstream dialects. It differs to the point that it makes your meaning a bit unclear, but let me make sure I understand what you're saying in your conclusion: "Can't believe that nobody told to Apple about this issue..." I think you mean either "I'm surprised that other users are not complaining about this [voice dictation accuracy] problem" or "Hasn't anyone noticed this and asked Apple about it?"

     

    For either interpretation of your intention, I think the explanation has to do with the way Apple relates to the users, which I would actually describe by starting with a comment that a self-confessed fanboi wrote elsewhere on an unrelated topic: "It just works." Now he sees that as a good thing, but I see it as a problem because I'm not always willing to accept that the way it works is good enough... That's probably the real crux of my "problem" with the voice dictation results. Apple has said that this is how it works and they expect me to be satisfied with those results, whereas I want to improve the results and I think that understanding HOW it works would help me make it work better, possibly by adjusting the ways that I'm using it.

     

    Your [pinkypinky's] observation on the ambiguous word feature is interesting, and I will attempt to do some testing to confirm whether or not it is accurate. My initial feeling is that it probably isn't correct, since I'm usually connected to the network, even though I have installed the offline "enhanced" software modules for several languages. I also (obviously) agree with you that it would be good if Apple support at some level or path was able to address such "issues" before they become undeniable "problems".

     

    In my own case, I think the issue is already moot. The result was that I mostly quit using the Mac and now regard it as an expensive mistake that will never justify the large investment. I'm probably going to give the machine to a friend and hope that person finds it more useful than I did.

     

    I want to attempt to speak in defense of Apple's support system because I do feel that the people I spoke to on the phone made a sincere effort to answer my questions, but they don't have access to the required data. I cannot speak in defense of the Apple store in Ginza, since my visits there were completely useless and did not even feel "sincere", but perhaps that was due to my own failure to make advance reservations over the Internet. However, my access to both of those options is exhausted now, even though I barely used them and certainly received no satisfaction via those channels. Again, that's my own fault for not giving higher priority to this matter during my 3-month new-owner period. I also spent quite a bit of time on the Apple website, and it was quite disappointing. Lastly, I should mention this discussion forum as part of Apple's support system, but that's kind of hard to assess... It seems obvious that Apple has provided incentives to encourage participation, but I didn't receive any useful answers to the tough questions. (Or maybe there were some and I just failed to recognize them?)

  • by Barney-15E,

    Barney-15E Jun 27, 2015 6:25 PM in response to pinkypinky
    Level 8 (49,784 points)
    Mac OS X

    pinkypinky wrote:

     

    Is this version "enhanced"? Wow, fantastic. But, excuse me, let me better understand: enhanced regarding...what?

    It is "enhanced" in that your spoken words are not sent to Apple's servers for transcription. That bothers a lot of people. Enhanced Dictation uses a smaller database on the Mac to do the transcription.

  • by pinkypinky,

    pinkypinky Jun 27, 2015 6:48 PM in response to Barney-15E
    Level 1 (0 points)

    Really? Well, maybe I am missing something. English is not definitely my own language, but I believed that "enhanced" was referring to a better behaviour, not simply a different one.

  • by Barney-15E,

    Barney-15E Jun 27, 2015 7:21 PM in response to pinkypinky
    Level 8 (49,784 points)
    Mac OS X

    pinkypinky wrote:

     

    Really? Well, maybe I am missing something. English is not definitely my own language, but I believed that "enhanced" was referring to a better behaviour, not simply a different one.

    Words can now be interpreted however you decide. Just ask the US Supreme Court.
