SU-10A Speech recognition board


Hello everybody ... I found this board via Google, along with a site "describing" it. But there is no contact address, and all the links lead to pages in ... Chinese! Does anybody have more information (in English)? Thanks for any answer ...

This topic has a solution.
Last Edited: Sat. Jun 11, 2022 - 03:38 AM

Never heard of this board before. I can see a video demo here: https://www.cnx-software.com/202...

I used the Elechouse voice recognition module earlier and wasn't very impressed with it.


How well do you speak Chinese?

 

jim

 

FF = PI > S.E.T

 

This reply has been marked as the solution. 

The demo video reminds me of the demo version of the HLK-V20. About Elechouse: the V3 (with training) rejects too many words, and its threshold is not adjustable. The SimpleVR (speaker-independent) works "rather well" with vocabularies of at most 4 VERY different words, selecting the vocabulary according to the context. Unfortunately, it seems rather sensitive to speaking "speed" for longer expressions, so it may recognize a different expression from the set (if one says "continuous turn" too fast, it is recognized as ... "interrupted", which is the OPPOSITE!). Maybe if I used "continuous turn" and "interrupted itinerary" ...? But I wonder what would happen if by mistake I said "continuous itinerary" or "interrupted turn" ... And with such long phrases, it's easier to push buttons!
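
The context-dependent vocabulary selection mentioned above could be organized on the host side roughly as follows. This is only a sketch, not the SimpleVR's actual interface: the context names, the command words, the transition table, and the `MAX_WORDS` limit (taken from the "4 words maximum" remark) are all assumptions for illustration, and loading a vocabulary into the module is left out entirely.

```python
MAX_WORDS = 4  # assumed per-context limit, from the "4 words maximum" remark

# Hypothetical contexts and commands, for illustration only.
VOCABULARIES = {
    "idle": ["start", "menu"],
    "menu": ["route", "speed", "lights", "back"],
    "route": ["continuous", "interrupted", "back"],
}

# Which context a recognized word leads to; pairs not listed keep the context.
TRANSITIONS = {
    ("idle", "menu"): "menu",
    ("menu", "route"): "route",
    ("menu", "back"): "idle",
    ("route", "back"): "menu",
}


def check_vocabularies(vocabularies, max_words=MAX_WORDS):
    """Raise ValueError if any context has too many or duplicate words."""
    for context, words in vocabularies.items():
        if len(words) > max_words:
            raise ValueError(f"{context}: more than {max_words} words")
        if len(set(words)) != len(words):
            raise ValueError(f"{context}: duplicate words")


def next_context(context, word):
    """Return the context to activate after `word` was recognized in `context`."""
    if word not in VOCABULARIES[context]:
        return context  # word is not active in this context: ignore it
    return TRANSITIONS.get((context, word), context)


check_vocabularies(VOCABULARIES)
```

Keeping each context's word list short and acoustically distinct is the point of the scheme; the table only decides which short list is active at a given moment.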

 

My final conclusion: SPEECH RECOGNITION IS CRAP!

 

Last Edited: Thu. Jun 2, 2022 - 06:43 AM

Here is a description in English. Have you already read it?

https://www.aliexpress.com/item/...


Yes, a hardware description ... but useless: no protocol, no utility to customize the vocabulary. A video sequence in Chinese! A screenshot here ... still with Chinese text. And nothing on GitHub. I wonder if that module can be programmed to understand commands in another language?

Last Edited: Fri. Jun 3, 2022 - 03:50 AM

This reminds me of Kripke from The Big Bang Theory complaining about Siri: https://www.youtube.com/watch?v=...

Enjoy!

jim

 

Last Edited: Fri. Jun 3, 2022 - 06:08 PM

Artificial intelligence < natural stupidity.

 


Finally, does anybody know a REALLY RELIABLE, OFFLINE module (No Alexa, etc, or smartphones please)? Meanwhile I tested Mikroe's SpeakUp 2: it works rather well but, in my view, rather slowly, and I had issues with words where the vowel in the first syllable is an "i", like "interrupted", "pittsfield", etc. The module may be speaker-dependent or speaker-independent, but unlike the Mikroe, it should allow a headset to be connected (maybe this would solve the issue with the "i"). It should recognize at least English and not be Arduino-dependent (like Audeme's MOVI, which is rather slow too, as I saw on YouTube). Please tell me about your experiences; mine were rather NEGATIVE: no module matched my requirements!

Last Edited: Fri. Jun 10, 2022 - 04:11 PM

lemiceterrieux wrote:
(No Alexa, etc, or smartphones please)
Did you ever wonder why all the successful speech recognition solutions do involve server processing? Before things like Siri, Google Assistant, Alexa, etc. speech recognition was a bit of a joke because it's almost impossible to do it in the limited power of a portable, battery powered device.



 

Yes, you can do it in a portable product with high real-time detection reliability; your board will probably look like this:

 

When in the dark remember-the future looks brighter than ever.   I look forward to being able to predict the future!

Last Edited: Fri. Jun 10, 2022 - 04:50 PM

In fact, there are such boards (a little bit smaller), made by a German company ... for about 170 euros!

 


In the late 1970s I worked at a company that made speech recognition products. The devices were 3U (5U?) rack mount. They had either an embedded DEC LSI-11 or Data General Nova minicomputer, plus two (or three?) of the company's own custom boards (about 12" x 12") full of analog computing modules. You also had to train the system to recognize your voice.

 

They definitely cost more than 170 in any currency!

 

The code was written in assembly and burned into 8 by 256 (?) DIP PROMs. My project was a voice-controlled wheelchair for the US Veterans Administration.

 

https://patents.justia.com/assig...

https://ntrs.nasa.gov/citations/...

 

Mike

When you're used to privilege, equality feels like oppression. / Malena Ernman


clawson wrote:
Did you ever wonder why all the successful speech recognition solutions do involve server processing?
Some 16-bit MCUs have I2S and software codecs (encode and stream to the server); some (a few?) 8-bit MCUs should be able to stream at least telephony-quality audio.
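
As a back-of-the-envelope check on the streaming claim: classic telephony audio is sampled at 8 kHz with 8 bits per sample, which is 64 kbit/s of raw data. The sketch below compares that with a serial link; the 115200 baud UART with 8N1 framing is purely an assumption for illustration, not something any of the boards in this thread is known to use, and it says nothing about whether the MCU has cycles left for sampling and encoding.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def uart_payload_rate(baud, data_bits=8, frame_bits=10):
    """Usable data rate of a UART link; 8N1 framing sends 10 bits per byte."""
    return baud * data_bits // frame_bits


telephony_bps = pcm_bitrate(8000, 8)   # 8 kHz, 8-bit mono
link_bps = uart_payload_rate(115200)   # assumed link, see lead-in
print(telephony_bps, link_bps, telephony_bps <= link_bps)
# prints: 64000 92160 True
```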

clawson wrote:
... because it's almost impossible to do it in the limited power of a portable, battery powered device.
What's the battery's size and weight?  (context : smart phone)

 


Vosk Installation

Android build

...

 

[1/3 page]

Websocket Server and GRPC server

...

due to CMU Sphinx Downloads – CMUSphinx Open Source Speech Recognition

 

"Dare to be naïve." - Buckminster Fuller


 The devices were 3U (5U?) rack mount. They had either an embedded DEC LSI-11 or Data General Nova 

It's amazing how far things have come, yet in some cases too much good work gets tossed aside or forgotten. People may have developed efficient algorithms out of necessity, and now they get replaced by some slop because we can run a processor at 10 GHz with 400 Gig of memory. So we may have a system 10000 times more powerful but only get a 20x improvement, because we have become much more wasteful (or less efficiency-minded).


clawson wrote:
Did you ever wonder why all the successful speech recognition solutions do involve server processing? Before things like Siri, Google Assistant, Alexa, etc. speech recognition was a bit of a joke because it's almost impossible to do it in the limited power of a portable, battery powered device.

 

All these systems like Alexa and the Google equivalent are fine ... for hackers, who can hear everything you say at home. The NSA and FSB are listening ...

 


And this is supposed to be an almost "professional" offline system (at least as far as the price goes: in Germany, 119 euros for the "Stamp", 238 euros for the complete development system): watch from 8:30 on; for me it triggers "uncontrollable laughter"! Can computer systems be deaf, too? In the end, it's easier to press a button than to repeat a command several times (especially if it could be an "emergency" command). Ultimately, I think these systems are only suitable for "hardcore nerds" ... regardless of what they cost.

 

Last Edited: Sat. Jun 11, 2022 - 09:19 PM

gchapman wrote:

What's the battery's size and weight?  (context : smart phone)

Sort of irrelevant. When you use Siri, Google Assistant, Alexa on your phone it's still sending the samples up to powerful server farms to do the grunt work. 


A rough estimate is that one 18650 cell could power an Arm Cortex-A7 APU for one work shift; the compute power is a guess, as Vosk may need AArch64 (arm64).

Context: certain industrial environments with possibly explosive atmospheres, and therefore no cellular and no Wi-Fi.

 

STMP15X-SOM (Olimex)

New 1 GHz SAMA7G54 is the First Single-Core MPU with MIPI CSI-2 Camera Interface and Advanced Audio Features | Microchip Technology

Power Sources | SAMA7G54-EK User's Guide (a guess is one watt)

Vosk Installation
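
The one-shift estimate can be checked with simple arithmetic. The one-watt load is the guess given above; the cell capacity (3000 mAh), the nominal voltage (3.6 V), and the regulator efficiency (85 %) are assumptions for a typical 18650 setup, not measured values, so treat the result as an order-of-magnitude sketch.

```python
def runtime_hours(capacity_mah, nominal_v, load_w, efficiency):
    """Estimated runtime of a single cell feeding a constant load."""
    energy_wh = capacity_mah / 1000 * nominal_v  # stored energy in Wh
    return energy_wh * efficiency / load_w


# Assumed cell and converter figures; the 1 W load is the guess from the post.
hours = runtime_hours(capacity_mah=3000, nominal_v=3.6, load_w=1.0, efficiency=0.85)
print(f"{hours:.1f} h")  # prints: 9.2 h
```

With these assumptions the cell lasts a little over nine hours, so an eight-hour shift is plausible but leaves little margin if the real load is higher than one watt.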

 

"Dare to be naïve." - Buckminster Fuller


Yesterday I had to call the French postal service about a delivery. I was asked to dictate a tracking code, which began with "CY". I should explain that in French, "Y" is pronounced "I-grec" (Greek I). I said the code and got the answer: "Did you say CI?" ... "No!" "Please say the first two characters of the code" ... "CY". "Did you say CI?" ... "No!" "We are sorry, we can't understand your code; please look up your correct code first. Or do you want to speak to an operator?" No comment. And I think this program is running on a mainframe, like Siri, Alexa, etc.!

 

Last Edited: Sat. Jun 25, 2022 - 02:31 AM

Meanwhile I found three other systems:

- Audeme MOVI, where the issue mentioned on page 20 of the manual (echo in the room) makes me suspicious, and for which no protocol is available that would let me use it with a system other than Arduino or a language other than C (I program in GC Basic).

- A module called "Sugar ASR Kittenbot": I asked by mail about the reaction time and never got an answer (maybe this was THE annoying question?).

- And a rather expensive German system, which already fails in its demo on YouTube (command rejected).

I have bought and tested 6 different modules (since 2010!); none worked in a way that satisfied me (though they worked very well for the vendors' cash flow), and now, implicitly, three others (fortunately without buying them).

No comment ...