Design Notes
Where we started
Web-888 originates from the RX-888 project. Since the launch of RX-888, we have noticed an increasing number of applications using RX-888 for "web-based" functionality. Notable examples include the KA9Q-Radio and PhantomSDR projects. However, achieving WEB-SDR functionality this way typically requires a Linux PC or a Raspberry Pi 5, which is neither the most convenient nor the most economical solution.
Therefore, the RX-888 Team decided to design a plug-and-play Web-888. We believe that Web-888 should have the following features:
- Plug-and-play capability, requiring only an Ethernet cable and an antenna to work. With an excellent UI and operation, it can run directly on any browser without needing Linux knowledge, as not everyone is a software expert.
- Support for as many online users as possible, with enough RX channels to allow one SDR to function like multiple SDRs working in parallel.
- No need for an additional PC, Raspberry Pi, or Beaglebone board; it works as a single board.
- Low network bandwidth usage, with decoding completed on the SDR end, reducing the dependency on network environment and speed.
- Excellent reception performance, featuring high sensitivity, high dynamic range, and wide bandwidth.
- Cost-effective and affordable.
Based on these requirements, we have initially conceptualized the architecture of Web-888. It requires a high-resolution, high-sampling-rate ADC, LNA, ATT, LPF, a resource-rich FPGA, a sufficiently powerful CPU, rich memory, and a fast Ethernet PHY.
Hardware
We started from the original RX-888, retaining its excellent ATT+LPF+LNA+ADC RF design, and re-optimized the LPF to improve suppression above 60MHz for wideband reception from 0-60MHz. We also added a second antenna input with a 118-145MHz BPF and a +20dB LNA. This lets the LTC2208, with its 130MSPS sampling rate, receive Air-Band/VHF directly by under-sampling in the second Nyquist zone, with no additional converter.
For the digital backend, we drew inspiration from the classic PlutoSDR design, employing Xilinx's ZYNQ XC7Z010. The integrated FPGA+ARM design of the ZYNQ facilitates high-speed data sharing between PL and PS via AXI (Advanced Extensible Interface), resolving the traditional challenges of data interaction in separate FPGA+CPU designs. Using the ZYNQ also addresses the need for both FPGA and CPU, enabling a more streamlined design that fits the entire SDR hardware onto a single PCB. We also included Realtek's 1000M Ethernet in the Web-888 to ensure minimal network latency and stability.
To make connecting the SDR over Wi-Fi convenient, we provide a USB-C port with a USB 2.0 host for directly connecting a USB Wi-Fi dongle. It can also be used as a UART bridge to support CAT commands from other SDR software.
Using a 0.5ppm TCXO as the ADC's clock source reduces the need for GPS. Nonetheless, we have included a GPS module supporting BDS/GPS/GLONASS/GALILEO, with PPS for clock calibration. We also offer a reference clock port that can work as an input or an output. When it is used as an output, the GPS module disciplines the clock, allowing the Web-888 to provide a GPSDO-grade variable clock source.
Firmware (FPGA)
This project implements fully digital intermediate frequency signal processing in an FPGA. It includes 15 channels of digital downconversion, with 13 channels of 12kHz sampled audio data processed through meticulously designed and debugged multistage filters and a custom-developed AXI bus system. Two channels of spectrum data utilize a custom high-bandwidth DMA bus design, achieving a maximum rate of 2GB/s and an operating rate of 1GB/s. The maximum sampling rate of 130MHz ensures that signals within a 60MHz bandwidth are clearly visible during scaling, allowing for seamless zooming with detailed clarity. The professional timing optimization design provides ample margins, ensuring long-term stable operation in various environments.
The Web-888 utilizes a full AMBA-AXI bus architecture, providing reliable high-bandwidth multi-channel data transmission. Data transfer between the dual-core ARM A9 processor and the FPGA (PL) is facilitated by four sets of AXI3-Full buses. One GP (General Purpose) bus is used for configuring registers and reading statuses, while one HP (High Performance) bus handles DMA transfers for 13 channels of received IQ data. Two additional HP buses manage DMA transfers for two spectrum data channels. These two DMAs use a time-division multiplexing scheduler to display 13 channels of 60MHz full-bandwidth waterfall spectrum.
The general configuration interface supports high-performance burst transfers, ensuring real-time configuration of a large number of internal registers and accurate reading of the GPS PPS counter values.
The high-performance data bus features a unique design, achieving 13-channel digital downconversion, IQ data transmission, and multistage filtering with minimal resource usage. This section employs digital mixing combined with a two-stage filtering design. Digital downconversion of the 13 channels produces 13 IQ pairs, that is, 26 separate I and Q signals, which are then processed through a first-stage CIC decimation filter. The 26 signals are converted into 26 serial data packets via a custom-developed bus interconnect module. These serial data packets then enter the second-stage multiplexed CIC filter. The output of the second-stage CIC is packed into data packets of a specified format by a custom-developed packaging module and then transmitted through a high-performance custom-developed DMA.
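As an illustration of the CIC decimation step described above, here is a minimal software sketch in C. It is not the FPGA implementation: the filter order (`CIC_ORDER`), the decimation ratio, and the names `cic_t` and `cic_push` are assumptions made for the example, and the mixer and the second-stage multiplexed CIC are left out.

```c
#include <stdint.h>

#define CIC_ORDER 3  /* assumed; the filter order is not given in the text */

typedef struct {
    uint64_t integ[CIC_ORDER]; /* integrator accumulators, run at the input rate */
    uint64_t comb[CIC_ORDER];  /* comb delay registers, run at the output rate */
    int rate;                  /* decimation ratio R (hypothetical value set by caller) */
    int phase;                 /* inputs consumed since the last output */
} cic_t;

/* Push one input sample. Returns 1 and writes *out once every `rate` inputs,
 * otherwise returns 0. Unsigned accumulators wrap on overflow, which mirrors
 * the modular arithmetic a hardware CIC relies on. */
int cic_push(cic_t *f, int32_t in, int64_t *out)
{
    uint64_t v = (uint64_t)(int64_t)in;

    for (int i = 0; i < CIC_ORDER; i++) {   /* integrators: y[n] = y[n-1] + x[n] */
        f->integ[i] += v;
        v = f->integ[i];
    }

    if (++f->phase < f->rate)
        return 0;                           /* no output for this input sample */
    f->phase = 0;

    for (int i = 0; i < CIC_ORDER; i++) {   /* combs: y[m] = x[m] - x[m-1] */
        uint64_t prev = f->comb[i];
        f->comb[i] = v;
        v -= prev;
    }

    *out = (int64_t)v;
    return 1;
}
```

With a zero-initialized filter, `rate = 4`, and a constant input of 1, the output settles at 64 from the third output sample onward. That is the filter's DC gain R^N = 4^3, which a real design would scale away.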
The high-performance spectrum bus design is inspired by KiwiSDR to achieve a 60MHz full-bandwidth display. This section consists of digital downconversion, a variable sampling rate CIC, and a high-performance DMA. After downconversion, the sampling rate is adjusted according to the spectrum's zoom settings. The DMA used for spectrum data transmission operates in two modes:
- Exclusive Mode: In this mode, two spectrum channels are dedicated to two clients, while other clients only receive a 12kHz bandwidth spectrum stream. The DMA operates in address loopback continuous transmission mode.
- Time-Division Multiplexing (TDM) Mode: In this mode, the two spectrum channels are time-division multiplexed across 13 clients. After collecting a line of data, the DMA stops automatically and switches to collect the spectrum data for the next online user. The TDM mechanism is optimized by software based on the number of online users and zoom level, ensuring maximum DMA utilization.
If you are interested in a more detailed technical description of the FPGA implementation, please read this document.
Software
The RX-888 Team previously worked on the RaspSDR project, which is largely a re-platforming of KiwiSDR. KiwiSDR has an easy-to-use UI and a rich set of features, which aligns with our goals for Web-888.
On the other hand, we wanted to fix some problems of KiwiSDR while keeping its pleasant user experience. The key problems we set out to solve were:
Software Update
Distributing updates as source code is painful. It requires the user to spend anywhere from minutes to an hour rebuilding the project. It also limits the build technology we can use.
So in Web-888, we decided to use binary updates to speed this up. In addition, we created two update channels. The alpha channel is for users who always want the latest features and are able to fix problems they encounter. The stable channel is for users who want a stable build, or who run unattended servers.
Broken SD Card
Running a Debian system from an SD card is a good choice but not the only one. It carries a high risk of the SD card becoming corrupted. We decided to use a Linux distribution that supports a read-only root partition. Alpine Linux is increasingly popular and is also used by the Red Pitaya Notes project.
A read-only SD card significantly reduces the chance of corrupting the card.
A side benefit is that we can put the config files in the root partition of the SD card. They can be backed up easily with a copy tool, or edited manually.
User Mode Task Scheduler
The KiwiSDR code, like the RaspSDR code, uses user-mode scheduling in order to guarantee the strict timing required to pull data over the SPI bus.
Thanks to the Zynq7010 design, we have a much more flexible way to move data from the FPGA to CPU memory. In Web-888, we implemented a DMA controller in the FPGA to move the data to the CPU. The CPU is not involved at all during the transfer, so we no longer have such strict timing requirements.
We changed the code to use the Linux kernel scheduler. If you check the CPU usage on a Web-888, you may notice that the CPU is mostly idle when there is no active user, and that the load is balanced between the two cores. We can now use all of the CPU's computing power and rely on the Linux kernel to schedule the workload across the two CPU cores.
One RX channel takes about 3% of one core, and one WF channel takes about 9%. In addition to the reduced CPU usage, we also have much more flexibility to call blocking APIs, which have to be called in an RPC style in Kiwi. This simplified many code paths, especially for extensions like DRM, FT8, and WSPR.
Other Innovations
We also put together several innovations in Web-888.
13 WF channels - shared mode
The Zynq7010 only has enough resources for 2 WF channels. To support 13 WF channels, we developed a time-sharing scheme in which the 2 WF channels are shared between 13 concurrent users.
Inside the FPGA, two DMA controllers move data from the FPGA FIFO to CPU memory. Once enough data (8192 samples) has been moved, the CPU copies the data out and triggers another transfer with the next channel's frequency and decimation settings.
We had to implement a DMA controller that supports burst transfers in order to meet the timing requirement.
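The time-sharing idea can be sketched as a round-robin pick that runs each time one of the two DMA engines finishes a transfer. This is only an illustration under stated assumptions: `wf_sched`, `wf_next_client`, and the `busy_client` parameter are hypothetical names, and the real scheduler also weighs the number of online users and the zoom level, which is not modeled here.

```c
#include <stdbool.h>

#define MAX_CLIENTS 13  /* concurrent WF users, from the text */

typedef struct {
    bool active[MAX_CLIENTS]; /* which client slots currently have a user */
    int  cursor;              /* client that was served most recently */
} wf_sched;

/* Choose the next active client after the cursor, skipping the client that
 * the other DMA engine is currently serving (pass -1 if it is idle).
 * Returns the client index, or -1 if no eligible client exists. */
int wf_next_client(wf_sched *s, int busy_client)
{
    for (int i = 1; i <= MAX_CLIENTS; i++) {
        int c = (s->cursor + i) % MAX_CLIENTS;
        if (s->active[c] && c != busy_client) {
            s->cursor = c;
            return c;
        }
    }
    return -1;
}
```

After picking a client, the caller would program that client's frequency and decimation settings and start the next 8192-sample transfer on the engine that just became free.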
Tunable ADC Sampling rate
The ADC's clock is not directly connected to a TCXO; instead, it is connected to CLK0 of the Si5351. This design is inherited from RX-888 and has proven to offer a good balance of flexibility and quality.
We chose 122.88MHz as our sampling rate for the HF band. With the 24.576MHz reference TCXO, the Si5351 settings are:
| Item | Value |
|---|---|
| Wanted frequency | 122 880 000,000 Hz |
| Crystal frequency | 24 576 000,000 Hz |
| VCO frequency | 860 160 000,000 Hz |
| True frequency | 122 880 000,000 Hz |
| Deviation | 0 Hz / 0 PPB |
In the Web-888 boot-loader, the Si5351 is configured for 122.88MHz. It can be configured for other frequencies for different purposes.
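The zero deviation follows from simple integer ratios, which can be checked with a few lines of C. The multiplier of 35 is not stated in the text; it is derived here from the VCO and crystal frequencies (860.16MHz / 24.576MHz), and the function names are hypothetical. The sketch only checks the arithmetic and does not show how the boot-loader programs the chip.

```c
#include <stdint.h>

/* PLL feedback multiplier implied by the VCO and crystal frequencies:
 * 860 160 000 / 24 576 000 = 35 (derived, not quoted from the text). */
enum { PLL_MULT = 35 };

/* VCO frequency for a given reference crystal frequency. */
int64_t vco_hz(int64_t xtal_hz)
{
    return xtal_hz * PLL_MULT;
}

/* Integer output divider that produces target_hz exactly from the VCO,
 * or 0 if no integer divider exists for that target. */
int64_t integer_divider(int64_t vco, int64_t target_hz)
{
    return (vco % target_hz == 0) ? vco / target_hz : 0;
}
```

For the 24.576MHz reference this gives a VCO of 860.16MHz and an integer divider of 7 for 122.88MHz, hence the 0 Hz deviation. A target such as 101MHz has no integer divider from this VCO, so it would need a different PLL setting or a fractional divider.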
VHF support with Undersampling
We decided to use undersampling to support VHF bands, in particular so that we can cover the whole air-band, which is a common use case on the Internet.
However, we found that a 122.88MHz sampling rate is not a good choice for air-band. Air-band starts at 118MHz, which is below 122.88MHz, so the band straddles the sampling frequency. After undersampling, 118MHz lands at 4.88MHz, but 127.76MHz also lands at 4.88MHz, so the two signals overlap and cannot be separated.
Thanks to the Si5351, we can choose another sampling rate here, such as 101MHz or 98MHz.
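The folding described above can be reproduced with a short C sketch based on standard sampling theory. The function names `alias_hz` and `nyquist_zone` are made up for this example and are not part of the Web-888 code.

```c
#include <stdint.h>

/* Frequency at which an RF input appears after sampling at fs_hz.
 * The spectrum repeats every fs and mirrors about multiples of fs/2. */
int64_t alias_hz(int64_t rf_hz, int64_t fs_hz)
{
    int64_t r = rf_hz % fs_hz;               /* position within one period */
    return (r > fs_hz / 2) ? fs_hz - r : r;  /* fold the upper half back */
}

/* Nyquist zone that contains rf_hz (zone 1 is 0 to fs/2). */
int nyquist_zone(int64_t rf_hz, int64_t fs_hz)
{
    return (int)((2 * rf_hz) / fs_hz) + 1;
}
```

At 122.88MHz, `alias_hz` returns 4.88MHz for both 118MHz and 127.76MHz, and the two inputs fall into different zones (2 and 3). At 101MHz, the 118-145MHz range of the VHF input filter maps to 17-44MHz and stays inside a single zone, so no two in-band frequencies collide.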
Poor man's GPSDO
The PPS signal from GPS is used to calculate the ADC frequency. This feature comes from KiwiSDR, where it is used to correct the displayed frequency.
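The measurement behind this can be sketched in C as follows. The function names are hypothetical; the only idea taken from the text is that counting ADC clock ticks between two PPS edges, which are one second apart, gives the actual ADC clock frequency.

```c
#include <stdint.h>

/* Clock error in parts per billion, from the number of ADC clock ticks
 * counted between two consecutive PPS edges (nominally one second). */
double clock_error_ppb(uint32_t ticks_per_pps, uint32_t nominal_hz)
{
    double diff = (double)ticks_per_pps - (double)nominal_hz;
    return diff / (double)nominal_hz * 1e9;
}

/* Correct a frequency that was computed assuming the nominal clock.
 * If the clock really runs at ticks_per_pps Hz, every displayed
 * frequency is off by the ratio actual / nominal. */
double corrected_hz(double displayed_hz, uint32_t ticks_per_pps,
                    uint32_t nominal_hz)
{
    return displayed_hz * (double)ticks_per_pps / (double)nominal_hz;
}
```

For example, a count of 122 880 061 ticks against a nominal 122 880 000 corresponds to an error of roughly 496 ppb.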
Since we can tune the Si5351, we decided to use the PPS signal to discipline the Si5351 output frequency. A PID algorithm takes the tick count between two PPS signals and uses it to correct the Si5351.
The PID parameters are:
```c
static const float32_t Kp = 1.4f;  // Proportional gain
static const float32_t Ki = 0.15f; // Integral gain
static const float32_t Kd = 0.01f; // Derivative gain
```
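A minimal sketch of how a PID step with these gains could look in C is shown here. The gains are the ones listed above, but everything else is an assumption: the `pid_state` struct, the `pid_step` name, the one-update-per-PPS timing, and the meaning of the returned correction are illustrative, and the mapping from the correction to Si5351 settings is not shown.

```c
/* Gains from the text, restated as plain float so the sketch stands alone. */
static const float Kp = 1.4f;   /* proportional gain */
static const float Ki = 0.15f;  /* integral gain */
static const float Kd = 0.01f;  /* derivative gain */

typedef struct {
    float integral;    /* running sum of past errors */
    float prev_error;  /* error seen on the previous PPS */
} pid_state;

/* One controller update, assumed to run once per PPS edge.
 * `error` is nominal ticks minus measured ticks for the last second.
 * Returns the correction to apply (units are an assumption). */
float pid_step(pid_state *s, float error)
{
    s->integral += error;                     /* accumulate for the I term */
    float derivative = error - s->prev_error; /* change since last update */
    s->prev_error = error;
    return Kp * error + Ki * s->integral + Kd * derivative;
}
```

Starting from a zeroed state, an error of 10 ticks yields a correction of about 15.6. If the next error is 0, the integral term keeps a residual correction of about 1.4.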
Since the PLL inside the Si5351 is disciplined, we can use the same PLL to output clock signals externally. This is what the clock-out SMA connector provides.
Warning
Enabling this feature affects the phase noise of the Si5351, so we added a setting in the Admin Config tab to control it. It is disabled by default.
Thermal Management
A metal enclosure offers not only solid product quality but also good RF shielding. However, it makes managing the Zynq7010's temperature a challenge.
We used several methods to address this:
- Optimize CPU usage to reduce heat from the CPU.
- Optimize the FPGA following low-power design practices.
- Design the airflow to draw air in from two sides and carry the heat away.
However, none of these solutions satisfied us, so we finally added a fan to the product. To reduce noise, we use a larger fan running at low RPM. In our test samples, we can keep the temperature under 50°C.