Upgraded - The Deepgram JavaScript SDK v3

Luke Oliff

Published on 12/13/23

The JavaScript SDK, a fundamental building block for many Deepgram applications, has undergone significant changes in v3 to extend its capabilities and improve the developer experience. This post walks through the noteworthy changes and new features in the latest version.

Quick Tour

Before we dig into the changes, let's take a quick tour of what the new JavaScript SDK looks like in use.

Prerecorded Transcription

Example Code

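import { createClient } from "@deepgram/sdk";

// Create a client with your API key (here read from an environment variable).
const deepgram = createClient(process.env.DEEPGRAM_API_KEY);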

const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
  {
    url: "https://dpgr.am/spacewalk.wav",
  },
  {
    model: "nova",
  }
);

Live Transcription

Example Code

import { createClient, LiveTranscriptionEvents } from "@deepgram/sdk";

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
const dgConnection = deepgram.listen.live({ model: "nova" });

dgConnection.on(LiveTranscriptionEvents.Open, () => {
  // Log each transcript message as it arrives.
  dgConnection.on(LiveTranscriptionEvents.Transcript, (data) => {
    console.log(data);
  });

  // YOUR_AUDIO_SOURCE is a placeholder for whatever emits your audio chunks.
  YOUR_AUDIO_SOURCE.addListener("got-some-audio", (event) => {
    dgConnection.send(event.raw_audio_data);
  });
});

Changes

Let's delve into the key changes that make the new JavaScript SDK stand out among its predecessors.

ESM and UMD Support

ESM (ECMAScript Modules) was one of the most requested features from our community of users.

CommonJS was designed for server-side development and is the module system Node.js has used from the start. It loads modules synchronously with require(), so every dependency is fully loaded before the code that needs it runs, and it exports values through module.exports or exports. Dynamic import() is available from CommonJS too, but it is far less common there than in ESM. ESM, standardized in ECMAScript 2015 (ES6), uses static import and export statements that can be resolved before execution and loaded asynchronously. That static structure is what makes tree shaking possible, and together with native dynamic imports and strict module scoping it suits contemporary web applications well.

Shipping both ESM and UMD (Universal Module Definition, a format that also works as a plain browser script or with older module loaders) in our SDK acknowledges the diversity of projects and preferences within the developer community. Developers can choose the module system that fits their project requirements and environment.
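To make the difference between the loading styles concrete, here is a minimal sketch. It uses Node's built-in `node:path` module instead of the SDK so it runs without installing anything, and the function name `loadBasename` is made up for this illustration. The two commented lines show the synchronous CommonJS form and the static ESM form; the function itself uses dynamic `import()`, which returns a promise and is available from either module system.

```javascript
// CommonJS (synchronous):  const path = require("node:path");
// ESM (static):            import path from "node:path";

// Dynamic import() resolves asynchronously and works in both module systems.
async function loadBasename() {
  const path = await import("node:path");
  return path.basename("/audio/spacewalk.wav");
}

loadBasename().then((name) => console.log(name)); // prints "spacewalk.wav"
```

The same three forms apply to any package that ships both builds; which one you use depends on how your project is configured.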

WebVTT and SRT Captions

The significance of captions in making content accessible cannot be overstated. In our latest JavaScript SDK, we have deliberately split the captions functionality out into a standalone library. Both WebVTT and SRT captions are now available from that standalone package, which you can use with or without the rest of the SDK.

This move not only supports Deepgram's proprietary transcription format but also extends compatibility to captions from other providers (and we welcome contributions to add more).

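import { createClient } from "@deepgram/sdk";
import { webvtt /* , srt */ } from "@deepgram/captions";

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
const { result } = await deepgram.listen.prerecorded.transcribeUrl(
  { url: "https://dpgr.am/spacewalk.wav" },
  { model: "nova" }
);

// Convert the transcription result into captions.
const vttOutput = webvtt(result);
// const srtOutput = srt(result);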
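For readers who have not looked inside a caption file before, the following is a rough sketch of what producing WebVTT involves. It is not how the captions package is implemented: the input shape (an array of `{ start, end, text }` cues with times in seconds) and the function names `toVttTimestamp` and `toWebVtt` are assumptions made for this illustration.

```javascript
// Format a time in seconds as a WebVTT timestamp: HH:MM:SS.mmm
function toVttTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Build a WebVTT document from cues of the (assumed) shape { start, end, text }.
function toWebVtt(cues) {
  const blocks = cues.map(
    (cue) => `${toVttTimestamp(cue.start)} --> ${toVttTimestamp(cue.end)}\n${cue.text}`
  );
  // The file starts with the WEBVTT header; blocks are separated by blank lines.
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}

console.log(
  toWebVtt([
    { start: 0, end: 2.5, text: "Hello from the spacewalk." },
    { start: 2.5, end: 61.25, text: "This cue crosses the one minute mark." },
  ])
);
```

SRT is close to this but numbers each cue and uses a comma rather than a period before the milliseconds, which is why one library can reasonably offer both formats.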

Separate Callback and Synchronous Transcription Methods

Our callback feature has been separated into its own transcription methods. This allows for more robust usage, and less complexity in our type system.

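// Synchronous: the transcript comes back in the response.
const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(source, options);

// vs

// Callback: Deepgram sends the transcript to the callback URL when it is ready.
const { result, error } = await deepgram.listen.prerecorded.transcribeUrlCallback(
  source,
  new CallbackUrl(url), // CallbackUrl is exported by @deepgram/sdk
  options
);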

JavaScript Browser and Node.js Friendly (Isomorphic)

The SDK is isomorphic: it works the same way in browser and Node.js environments. Whether you're building a web application or a server-side script, the SDK has you covered. Transcription in client-side frameworks like React and Vue has been high on our community's wish list.

If you're planning to do REST-based transcription in a client application, we've also built proxy support into the SDK. You can now set a proxy in your SDK config so that REST requests from the client are routed through a server you control, which lets you use the best features on the client side. We've even included a demo proxy application to get you started.
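One common reason to put a proxy in front of REST calls is to keep the real API key on the server. The sketch below shows only the request rewriting such a proxy might perform; it is not the demo proxy application. The helper name `buildUpstreamRequest` and the shape of its `incoming` argument are hypothetical, while the `https://api.deepgram.com` origin and the `Authorization: Token ...` header follow Deepgram's public REST conventions.

```javascript
const DEEPGRAM_API_ORIGIN = "https://api.deepgram.com";

// Given the path and headers of a request that reached the proxy,
// work out what should be sent upstream. Hypothetical helper.
function buildUpstreamRequest(incoming, apiKey) {
  const url = new URL(incoming.path, DEEPGRAM_API_ORIGIN);
  // Refuse paths that would redirect the request to another host.
  if (url.origin !== DEEPGRAM_API_ORIGIN) {
    throw new Error("Refusing to proxy to a different origin");
  }

  const headers = {};
  for (const [name, value] of Object.entries(incoming.headers)) {
    const lower = name.toLowerCase();
    // Drop the proxy's own host header and any credentials the client sent.
    if (lower === "host" || lower === "authorization") continue;
    headers[lower] = value;
  }
  // Attach the real key on the server side only.
  headers.authorization = `Token ${apiKey}`;

  return { url: url.toString(), headers };
}

const upstream = buildUpstreamRequest(
  {
    path: "/v1/listen?model=nova",
    headers: { Host: "localhost:8080", "Content-Type": "application/json" },
  },
  "YOUR_API_KEY"
);
console.log(upstream.url); // "https://api.deepgram.com/v1/listen?model=nova"
```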

Improved Live Transcription Events

We have improved support for the different event types that come back over our WebSocket connection.
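As a rough picture of what per-event handling means, the sketch below routes incoming WebSocket messages by their `type` field. Deepgram's streaming responses include types such as `Results` and `Metadata`; the `MessageRouter` class, its method names, and the `unhandled` and `error` channels are invented for this illustration and are not the SDK's implementation.

```javascript
class MessageRouter {
  constructor() {
    this.handlers = new Map();
  }

  // Register a handler for one message type.
  on(type, handler) {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
    return this;
  }

  // Parse one raw text frame and call the handlers for its type.
  // Unparseable frames go to "error"; types with no handler go to "unhandled".
  dispatch(raw) {
    let message;
    try {
      message = JSON.parse(raw);
    } catch (err) {
      this.emit("error", err);
      return;
    }
    const type = this.handlers.has(message?.type) ? message.type : "unhandled";
    this.emit(type, message);
  }

  emit(type, payload) {
    for (const handler of this.handlers.get(type) ?? []) handler(payload);
  }
}

const router = new MessageRouter();
router
  .on("Results", (msg) => console.log("transcript:", msg.channel.alternatives[0].transcript))
  .on("Metadata", (msg) => console.log("metadata for request", msg.request_id))
  .on("unhandled", (msg) => console.log("ignored message type", msg.type));

router.dispatch(
  JSON.stringify({
    type: "Results",
    channel: { alternatives: [{ transcript: "hello world" }] },
  })
);
```

Giving each message type its own event means application code can subscribe only to what it needs instead of inspecting every frame.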

Switch from request to fetch

In response to evolving web standards, the SDK moves from the deprecated request library to the modern, widely adopted fetch API, which is available natively in many runtimes, including most browsers. This reduces the number of dependencies in our SDK.
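To illustrate why native fetch removes the need for an HTTP library, here is a sketch of a raw REST request built with the standard `Request` class, which is global in browsers and in Node.js 18 and later. The helper name `buildListenRequest` is hypothetical, and this is not the SDK's internal code; the `/v1/listen` endpoint, the JSON body with a `url` field, and the `Token` authorization scheme follow Deepgram's public REST API. Constructing the request does not send anything.

```javascript
// Build (but do not send) a pre-recorded transcription request.
function buildListenRequest(apiKey, audioUrl, options = {}) {
  const endpoint = new URL("https://api.deepgram.com/v1/listen");
  // Transcription options such as `model` travel as query parameters.
  for (const [key, value] of Object.entries(options)) {
    endpoint.searchParams.set(key, String(value));
  }
  return new Request(endpoint, {
    method: "POST",
    headers: {
      Authorization: `Token ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url: audioUrl }),
  });
}

const request = buildListenRequest("YOUR_API_KEY", "https://dpgr.am/spacewalk.wav", {
  model: "nova",
});
console.log(request.method, request.url);

// Sending it is a single call with no extra dependency:
//   const response = await fetch(request);
//   const result = await response.json();
```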

Initialization by Function Instead of Class

The new SDK adopts a function-based initialization approach, offering simplicity and consistency in how developers set up and configure the transcription service.

Scoped Constructor Config

With scoped constructor config, developers gain better control over the SDK's behavior, allowing for fine-tuned customization based on specific use cases.
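The two ideas above, a factory function and scoped config, can be sketched together. Everything in this block is hypothetical: `createSketchClient`, `resolveScope`, the scope names, and the example hostnames are invented to show the pattern and are not the SDK's actual option names. The point is that a plain function returns a ready-to-use object, and that settings resolve from defaults, then global overrides, then overrides for one scope.

```javascript
const DEFAULT_SETTINGS = { url: "https://api.deepgram.com", headers: {} };

// Effective settings for one scope: defaults, then global, then scope-specific.
function resolveScope(options, scope) {
  return {
    ...DEFAULT_SETTINGS,
    ...(options.global ?? {}),
    ...(options[scope] ?? {}),
  };
}

// Factory function: callers never use `new`, and each namespace
// carries its own resolved configuration.
function createSketchClient(apiKey, options = {}) {
  if (!apiKey) throw new Error("An API key is required");
  return {
    listen: { config: resolveScope(options, "listen") },
    manage: { config: resolveScope(options, "manage") },
  };
}

const client = createSketchClient("YOUR_API_KEY", {
  global: { url: "https://deepgram.internal.example" }, // e.g. a self-hosted deployment
  listen: { url: "wss://deepgram.internal.example" }, // override for one scope only
});

console.log(client.listen.config.url); // "wss://deepgram.internal.example"
console.log(client.manage.config.url); // "https://deepgram.internal.example"
```

A layered lookup like this is one way a single config object can point every request at a different host while still letting one part of the client differ.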

Better Errors

Meaningful error messages are paramount for debugging and troubleshooting. The SDK now provides clearer and more informative error messages, streamlining the development process.
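The Quick Tour examples destructure `{ result, error }` from each call rather than relying on try/catch. A minimal sketch of that pattern follows; `DemoApiError` and `toResultTuple` are names made up for this illustration and say nothing about the SDK's own error classes.

```javascript
// A hypothetical error type that carries an HTTP status alongside the message.
class DemoApiError extends Error {
  constructor(message, status) {
    super(message);
    this.name = "DemoApiError";
    this.status = status;
  }
}

// Run an async operation and return { result, error } instead of throwing.
async function toResultTuple(operation) {
  try {
    return { result: await operation(), error: null };
  } catch (error) {
    return { result: null, error };
  }
}

async function demo() {
  const ok = await toResultTuple(async () => ({ transcript: "hello" }));
  const failed = await toResultTuple(async () => {
    throw new DemoApiError("Invalid credentials", 401);
  });
  return { ok, failed };
}

demo().then(({ ok, failed }) => {
  console.log(ok.result.transcript); // "hello"
  console.log(`${failed.error.name} (${failed.error.status}): ${failed.error.message}`);
  // prints "DemoApiError (401): Invalid credentials"
});
```

Returning the error as a value keeps the failure path visible at every call site, and a named error with a status code gives the caller something concrete to branch on.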

Support for Future Products

Anticipating the future needs of developers, the SDK is designed to seamlessly integrate with upcoming products, ensuring long-term compatibility and extensibility.

Support for On-Prem Deployments

For applications with stringent privacy and security requirements, the SDK introduces support for On-Prem deployments, enabling organizations to maintain control over their transcription infrastructure.

Some Example Apps

Here are our example apps, all ready for the new JavaScript SDK.

Node pre-recorded starter app - A native JavaScript frontend, using a Node server application to interact with Deepgram to transcribe pre-recorded audio.
Next.js microphone starter app - A Next.js application, capturing the microphone audio and sending it to Deepgram from the browser.
JavaScript microphone demo - A native JavaScript frontend, capturing the microphone audio and sending it to Deepgram from the browser.
Node microphone demo - A native JavaScript frontend, capturing the microphone audio in the browser and sending it to a Node WebSocket server, which in turn sends the audio off to Deepgram.
Node live example - A small standalone server-side script that captures live audio from a BBC audio feed and sends it to Deepgram.
Node pre-recorded example - A small standalone server-side script that sends a pre-recorded file to Deepgram.