RSC-164
Recognition • Synthesis • Control
General Purpose Microcontroller Featuring Speech Recognition, Speech &
Music Synthesis, Speaker Verification and Audio Record/Playback
GENERAL DESCRIPTION
The RSC-164, from the Interactive Speech™ family of
products, is a low-cost microcontroller designed for use
in consumer electronics. The RSC-164 combines an 8-bit
microcontroller with high-quality speaker-independent
and speaker-dependent speech recognition, speech
synthesis, speaker verification, four-voice music
synthesis, and voice record and playback. Products can
use one or all of the RSC-164 features in a single
application.
The RSC-164 employs a sophisticated neural network
that learns to classify sound data. On-chip speech
recognition algorithms reach an accuracy of greater than
96% for speaker-independent recognition and greater
than 99% for speaker-dependent recognition. Sensory’s
neural network approach (patent pending) eliminates the
need for expensive signal processing or extensive RAM
storage.
The highly-integrated nature of the chip reduces external
parts count. A complete system may be built with few
additional parts other than a battery, speaker,
microphone, and audio input support circuitry. Low
power requirements make the RSC-164 an ideal solution
for battery-powered and hand-held devices.
FEATURES
Full Range of Speech Capabilities
• Speaker-independent speech recognition
• Speaker-dependent speech recognition
• High quality speech synthesis and sound effects
• Speaker verification
• Four-voice music synthesis
• Voice record & playback
Integrated Single-Chip Solution
• 4 MIPS 8-bit microcontroller
• On-chip A/D and D/A converters, digital filtering
• 32kHz clock for time keeping
• Internal 64kbytes ROM; 384 bytes RAM
• 16 general purpose I/O lines
• External memory bus: 16-bit Address, 8-bit Data
• On-chip output amplifier for direct speaker drive
Low Power Requirements
• 3.5 - 5.0V supply
• ~10mA operating
From the Interactive Speech™ Line of Products
RSC-164 DATA SHEET
RSC-164 Block Diagram
[Figure: block diagram of the RSC-164. Blocks shown: Microphone, Preamp and gain control, Multiplexer, ADC, AGC, Digital Logic, Microcontroller, RAM, ROM, DAC, AMP, Speaker, Oscillator, General Purpose I/O, External Memory.]
RSC-164 OVERVIEW
The RSC-164 is a member of the Interactive Speech™
line of products from Sensory. It features a high-
performance 8-bit microcontroller with on-chip A/D,
D/A, RAM and ROM. The RSC-164 is designed to bring
a high degree of integration and versatility into low-cost,
power-sensitive consumer applications.
Various functional units have been integrated onto the
CPU core in order to reduce total system cost and
increase system reliability without degrading system
performance. The RSC-164 delivers 4 MIPS of integer
performance at 14.32 MHz providing maximum
performance at minimum cost.
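As a rough worked example of the figures above, 4 MIPS at 14.32 MHz corresponds to an average of about 3.6 oscillator clocks per instruction. The sketch below computes that ratio and then estimates throughput at a divided clock. The function names and the divider value are illustrative, and the estimate assumes the same average clocks per instruction at every clock rate, which is an assumption rather than a data sheet figure.

```python
# Figures quoted in the data sheet.
CLOCK_HZ = 14.32e6           # high-frequency oscillator
INSTRUCTIONS_PER_SEC = 4e6   # 4 MIPS


def average_clocks_per_instruction(clock_hz=CLOCK_HZ, ips=INSTRUCTIONS_PER_SEC):
    """Average number of oscillator clocks consumed per instruction."""
    return clock_hz / ips


def estimated_mips(clock_hz, divider=1):
    """Estimated MIPS when the processor runs at clock_hz / divider.

    Assumption (not from the data sheet): the average clocks per
    instruction is the same at every clock rate.
    """
    return (clock_hz / divider) / average_clocks_per_instruction() / 1e6


print(round(average_clocks_per_instruction(), 2))    # 3.58
print(round(estimated_mips(14.32e6, divider=2), 2))  # 2.0
```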
The CPU core embedded in the RSC-164 is an 8-bit,
variable-length-instruction microcontroller. The
instruction set is loosely based on Intel’s 8051™, and has
a variety of addressing-mode mov instructions. The RSC-164
processor avoids the limitations of dedicated A, B,
and DPTR registers by having completely symmetrical
sources and destinations for all instructions. The 384 bytes
of internal RAM are organized as a Register Space.
Consecutive entry, one of the speech recognition features,
allows the chip to handle several voice inputs in succession,
as long as each input is surrounded by one-half second of quiet.
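The half-second rule can be illustrated with a small timing check. This is only a sketch of the stated constraint, not of the chip's internal algorithm: the function name, the (start, end) representation of utterances in seconds, and the choice to let one quiet gap serve both neighbouring inputs are all assumptions.

```python
QUIET_SEC = 0.5  # quiet required around each input, per the data sheet


def inputs_are_separable(utterances, recording_start=0.0, recording_end=None):
    """Return True if every (start, end) utterance is surrounded by quiet.

    Assumption: a single quiet gap between two neighbouring inputs counts
    as the trailing quiet of the first and the leading quiet of the second.
    Utterances must be given in time order.
    """
    previous_end = recording_start
    for start, end in utterances:
        if start - previous_end < QUIET_SEC:
            return False  # not enough quiet before this input
        previous_end = end
    if recording_end is not None and recording_end - previous_end < QUIET_SEC:
        return False  # not enough quiet after the last input
    return True


print(inputs_are_separable([(0.6, 1.2), (1.8, 2.4)], recording_end=3.0))  # True
print(inputs_are_separable([(0.6, 1.2), (1.5, 2.0)]))                     # False
```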
SPEECH AND MUSIC SYNTHESIS
The RSC-164 provides high-quality speech synthesis by
using a hybrid of a time-domain compression scheme
that improves on conventional ADPCM and a customized
reuse of sounds. Speech synthesis requires on-chip or
off-chip ROM to store audio sounds for synthesis.
The RSC-164 provides high-quality, low-cost four-voice
music synthesis which allows multiple, simultaneous
instruments for harmonizing. Music synthesis has low
ROM requirements - a 2-3 minute song requires under 5
kbytes of incremental memory. The RSC-164 uses a
MIDI-like system to generate music.
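To put the music ROM figure in perspective, the sketch below converts "under 5 kbytes for a 2-3 minute song" into an average bit rate. It assumes 1 kbyte = 1024 bytes and treats 5 kbytes as an upper bound; the function name is illustrative.

```python
def average_bit_rate(num_bytes, seconds):
    """Average data rate in bits per second."""
    return num_bytes * 8 / seconds


SONG_BYTES = 5 * 1024  # upper bound, assuming 1 kbyte = 1024 bytes

for minutes in (2, 3):
    rate = average_bit_rate(SONG_BYTES, minutes * 60)
    print(f"{minutes}-minute song: under {rate:.0f} bits per second")
```

Under these assumptions the music data averages a few hundred bits per second.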
RECORD AND PLAYBACK
The RSC-164 can perform audio record and playback at
various compression levels depending on the quantity
and quality of playback desired. Data rates of under
14,000 bits per second are achievable while maintaining
very high quality reproduction. The RSC-164 also
performs silence removal to improve sound quality and
reduce memory requirements.
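The quoted data rate translates directly into memory requirements. The sketch below uses 14,000 bits per second as an upper bound and a hypothetical 64-kbyte memory (the most a 16-bit address can reach without banking); it ignores the additional savings from silence removal.

```python
BITS_PER_SECOND = 14_000  # upper bound quoted in the data sheet


def bytes_needed(seconds, bits_per_second=BITS_PER_SECOND):
    """Memory needed to store a recording of the given length."""
    return seconds * bits_per_second / 8


def seconds_stored(memory_bytes, bits_per_second=BITS_PER_SECOND):
    """Recording time that fits in the given amount of memory."""
    return memory_bytes * 8 / bits_per_second


print(bytes_needed(10))                      # 17500.0 bytes for 10 seconds
print(round(seconds_stored(64 * 1024), 1))   # 37.4 seconds in 64 kbytes
```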
SPEECH RECOGNITION
The RSC-164 uses a neural network to perform speaker-
independent or speaker-dependent speech recognition.
Speaker-dependent recognition requires external memory
to store speech recognition information (e.g., SRAM,
Flash Memory).
Speaker-independent recognition
requires on-chip or off-chip ROM to store the words to be
recognized. The RSC-164 has several additional speech
recognition features as described below.
Continuous listening allows the chip to listen continuously
for a specific word. With this feature a product can be
used in a normal environment and “activates” only
when a specific word, preceded by quiet, is spoken.
SPEAKER VERIFICATION
The RSC-164 can also perform text-dependent speaker
verification. After a speaker trains the chip on a specific
word, the chip is able to identify whether that word is
spoken by the original speaker, thus providing biometric
security.
POWER
The typical operating current is 10 mA at 14.32 MHz.
Lowering the clock frequency reduces power
consumption, although speech recognition requires a
14.32 MHz clock.
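A first-order battery life estimate follows from the typical current. The sketch below is a simplification: the 1000 mAh capacity is a hypothetical value, and the calculation covers only the RSC-164 at its typical 10 mA, ignoring the speaker, preamp, and any battery derating.

```python
TYPICAL_CURRENT_MA = 10.0  # typical operating current at 14.32 MHz


def battery_life_hours(capacity_mah, current_ma=TYPICAL_CURRENT_MA):
    """Ideal run time in hours for a battery of the given capacity."""
    return capacity_mah / current_ma


print(battery_life_hours(1000))  # 100.0 hours for a hypothetical 1000 mAh battery
```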
RSC-164 Architecture Diagram
[Figure: architecture diagram of the RSC-164. Blocks shown: CPU; Register Space (384 bytes); Stack Space (8 bytes); Internal ROM, High (32K x 8) and Low (32K x 8), with -XMH and -XML inputs; External Memory Interface (A[15:0], D[7:0], -RDC, -WRC, -RDD, -WRD); ADC with input MUX (AIN0, AIN1, SH); DAC (DACOUT); Pulse Width Modulator (BUFOUT/PWM, -TE1/PWM); Analog Control; OSC1 (XI1, XO1); OSC2 (XI2, XO2); TIMER1; TIMER2; Interrupt Logic; Timing and Control (-RESET); Break Point Register; PORT0 (P0.0-P0.7); PORT1 (P1.0-P1.7).]
RSC-164 ARCHITECTURE
The RSC-164 is a highly integrated device that
combines:
• 8-bit microcontroller
• On-chip ROM (64 kbytes) and RAM (384 bytes), and the ability to address off-chip RAM or ROM
• A/D converter and D/A converter
A microphone with an external preamp converts sound
into an audio signal that is fed to the RSC-164. The gain
of the external preamp may be controlled by the RSC-164
by using two of the I/O lines. The RSC-164 uses an ADC
(Analog-to-Digital Converter) to convert the incoming
analog speech signal into digital data. The output audio
signal of the RSC-164 is derived from a DAC (Digital-to-
Analog Converter) or PWM (Pulse Width Modulator).
The RSC-164 has an external memory interface, with a
16-bit address bus and an 8-bit data bus, for accessing
external memory. It also has an internal ROM that can be
enabled or disabled (partially or fully) by pin inputs
(signals -XMH, -XML).
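The partial or full enabling of the internal ROM can be pictured as a per-half choice between internal and external memory. The sketch below is a model under stated assumptions, not the chip's documented behavior: it assumes the low 32K half occupies addresses 0x0000-0x7FFF and the high half 0x8000-0xFFFF, and it uses two hypothetical boolean flags instead of modelling the actual polarity of the -XML and -XMH pins.

```python
HALF_SIZE = 32 * 1024  # each internal ROM half is 32K x 8


def rom_source(address, low_internal=True, high_internal=True):
    """Return 'internal' or 'external' for a 16-bit address.

    Assumptions: the low half is 0x0000-0x7FFF and the high half is
    0x8000-0xFFFF; the flags say whether each half of the internal ROM
    is enabled, without modelling the pin polarity of -XML and -XMH.
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    in_high_half = address >= HALF_SIZE
    internal = high_internal if in_high_half else low_internal
    return "internal" if internal else "external"


# Internal ROM fully enabled, then only the low half enabled.
print(rom_source(0x1234))                       # internal
print(rom_source(0x9000, high_internal=False))  # external
```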
Two bi-directional ports provide 16 general purpose I/O
pins to communicate with external devices. The RSC-164
has a high frequency (14.32 MHz) oscillator as well as a
low frequency (32,768 Hz) oscillator suitable for
timekeeping applications. The processor clock can be
selected from either source, with a selectable divider
value. The device performs speech recognition when
running at 14.32 MHz. The RSC-164 also supports
programmable wait states to allow the use of slower
external devices. There are two programmable 8-bit
counters / timers, one derived from each oscillator.
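Each of the 16 I/O pins supports four modes (listed under GENERAL PURPOSE I/O: input with weak pull-up, input with strong pull-up, input without pull-up, or output), so two configuration bits per pin, 32 bits in total, are enough to describe both ports. The sketch below packs one mode per pin into a 32-bit word. The numeric mode codes and the bit layout are hypothetical; this data sheet does not give the actual register encoding.

```python
from enum import IntEnum


class PinMode(IntEnum):
    # Codes are illustrative, not the chip's real encoding.
    INPUT_NO_PULLUP = 0
    INPUT_WEAK_PULLUP = 1    # ~200 kOhm equivalent device
    INPUT_STRONG_PULLUP = 2  # ~10 kOhm equivalent device
    OUTPUT = 3


BITS_PER_PIN = 2   # four modes need two bits
PIN_COUNT = 16     # P0.0-P0.7 and P1.0-P1.7


def pack_modes(modes):
    """Pack 16 pin modes into one 32-bit integer, pin 0 in the lowest bits."""
    if len(modes) != PIN_COUNT:
        raise ValueError("expected one mode per pin")
    word = 0
    for pin, mode in enumerate(modes):
        word |= int(mode) << (pin * BITS_PER_PIN)
    return word


def unpack_mode(word, pin):
    """Read back the mode of a single pin from the packed word."""
    return PinMode((word >> (pin * BITS_PER_PIN)) & 0b11)


config = pack_modes([PinMode.OUTPUT] * 8 + [PinMode.INPUT_WEAK_PULLUP] * 8)
print(hex(config))  # 0x5555ffff
```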
USING THE RSC-164
Creating applications using the RSC-164 requires the
development of electronic circuitry, software code, and
speech/music data files. Software code for the RSC-164
can be developed by Sensory or by external programmers
using the RSC Development Kit. For more information
about development tools and services, please contact
Sensory. A typical product will require about $0.80 -
$1.50 (in high volume) of components in addition to
the RSC-164.
The following sample circuit provides an example of how
the RSC-164 might be used.
Sample Application Circuit
[Figure: sample application circuit schematic. Functional blocks shown: Input Preamp (electret microphone, LM324 op amp sections U1A-U1D, gain control lines IGAIN0 and IGAIN1, V_BIAS), SENSORY RSC-164 (U3), 27C512 EPROM, 24LC65 serial EEPROM, speaker output (LS1) with PNP drive transistor, Power Supply (1.5V cells, Power Switch; AVDD, VDD, VDDi, and PREAMP Vcc rails), Reset Circuit, Oscillator (Y1, 14.3MHz ceramic resonator or crystal, with 27pF capacitors C19 and C20), and decoupling capacitors. Grounds shown: PREAMP GND, RSC AGND, DGND.]
Notes from the schematic:
• PREAMP GND and AGND to be connected at the power supply near the RSC-164 AGND (analog) input.
• To use ext EPROM (U4): use R26. To use int ROM in RSC-164: use R25.
• Use EPROM if not ROM MASK.
• Transistor should have low (.1V) Vce(sat) at a forced beta of 100 at 100mA Ic. V(br)ebo should be above 5V. NEC's 9012H is an acceptable example.
RSC-164 INSTRUCTION SET
The instruction set for the RSC-164 has 52 instructions
comprising 8 move, 7 rotate, 11 branch, 11 register
arithmetic, 9 immediate arithmetic, and 6 miscellaneous
instructions. All instructions are 3 bytes or fewer, and no
instruction requires more than 8 clock cycles to execute.
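The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below sums the instruction groups and converts the 8-clock worst case into time at the 14.32 MHz clock; the variable names are illustrative.

```python
# Instruction counts per group, as listed in the data sheet.
INSTRUCTION_GROUPS = {
    "move": 8,
    "rotate": 7,
    "branch": 11,
    "register arithmetic": 11,
    "immediate arithmetic": 9,
    "miscellaneous": 6,
}

CLOCK_HZ = 14.32e6  # high-frequency oscillator
MAX_CLOCKS = 8      # no instruction requires more than 8 clock cycles

total_instructions = sum(INSTRUCTION_GROUPS.values())
worst_case_us = MAX_CLOCKS / CLOCK_HZ * 1e6  # longest instruction, in microseconds

print(total_instructions)       # 52
print(round(worst_case_us, 3))  # 0.559
```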
EXTERNAL MEMORY
The RSC-164 includes an external memory interface that
allows connection with memory devices for speaker-
dependent speech recognition, audio record/playback,
extended durations of speech and music synthesis, and
enhanced product functionality.
Separate data and address buses allow use of standard
EPROMs, ROMs, SRAMs, and flash memory with little
or no additional decoding. Provision of separate read and
write signals for each external memory space further
simplifies interfacing. The RSC-164 includes 8 data
lines (D[7:0]) and 16 address lines (A[15:0]), along with
associated control signals for interfacing to external
memory.
Using flash memory and EEPROM will require custom
code development. The RSC-164 can connect serially
through two I/O lines to a serial EEPROM for
applications with low data storage requirements.
GENERAL PURPOSE I/O
The RSC-164 has 16 general purpose I/O pins (P0.0-
P0.7, P1.0-P1.7). Each pin can be programmed as an
input with weak pull-up (~200kΩ equivalent device);
input with strong pull-up (~10kΩ equivalent device);
input without pull-up, or as an output. This is
accomplished by having 32 bits of configuration registers
for the I/O pins (Port Control Register A and Port
Control Register B for ports 0 and 1).
OSCILLATORS
Two independent oscillators in the RSC-164 provide a
high-frequency clock and a 32kHz time-keeping clock.
The oscillator characteristics are as follows:
Oscillator #1:
Pins XI1, XO1
14.32 MHz (3.5V-5.0V)