Speech Engineer

Storytel

Note: The application period for this posting has passed.

Job description

At Storytel we believe that powerful stories add an extra dimension to life. We offer hundreds of thousands of audiobooks and ebooks to customers in more than 20 markets, with several new markets launching in the coming year. Storytel is Northern Europe's largest audiobook streaming service, and we’re now looking for a Speech Engineer to join the team! Storytel’s vision is to make the world a more empathic and creative place with great stories to be shared and enjoyed anytime, anywhere, by anyone.

About the Team

The role is in the Speech team, part of the larger Intelligence group, which houses our machine learning and data science teams. In the Speech team we build services that enable Storytel to efficiently generate new content and understand existing content. In particular, our team owns the entire Text-to-Speech stack at Storytel, from data curation to modelling decisions, training and deployment infrastructure. To accelerate development and get the system into production, we are growing our team. Since the team is new, each position we're hiring for is considered essential. Our new team members will be expected to take on large responsibilities and will have an impact on all aspects of our work. While each role's main responsibilities are different, we will all work closely together to achieve our big ambitions for highly automated and prosodically rich speech synthesis.

We are an international company, with colleagues in the larger Intelligence team in Stockholm, Barcelona and Copenhagen. The Speech team is currently based in Stockholm, and while we hope to keep building the team in the Stockholm offices, we are open to working with the right candidates to find a solution that is great for both parties.

About the Role

As a Speech Engineer you will have a large responsibility for the components closest to the generated audio, from audio preprocessing to the architecture of the spectrogram prediction and vocoder models. You will work with the team to evaluate our stack using a combination of quality assessment methods and live evaluation, and tune our components to improve our results. To rapidly scale our work, you will find ways of using our existing data more effectively. Finally, since the field is evolving rapidly, you will also need to keep our stack up to date with developments in speech technology and interact with the research and open source communities.

Responsibilities

- Work together with the team to implement models, training and deployment of speech services, in particular TTS.
- Take a large role in defining our model architecture based on recent research, in particular models close to audio, including spectrogram prediction and vocoders.
- Develop and maintain our audio preprocessing pipelines.
- Ensure that we have relevant and strong metrics for offline evaluation of generated content, and improve our components based on these.
- Identify areas for improvements in our datasets based on quantitative evaluation.

About You

We believe that you are passionate about speech technology, see its potential, and are eager to use it to build something impactful. You’re interested in staying in touch with the field as it evolves and eager to expand your knowledge of how to put your work into production.

To be successful in this role we believe that you have:

- A PhD or MSc degree, or equivalent industry experience, in Speech Technology, Machine Learning, Computer Science, Mathematics, Physics, or a related field
- Research or industry experience in audio and speech processing using neural networks, e.g. text-to-speech, speech recognition, or similar topics
- Extensive knowledge and understanding of modern deep learning
- A strong understanding of the current state of text-to-speech, including developing trends and current state-of-the-art models
- Familiarity with common audio processing methods and their corresponding terminology
- Comfort working in Python and one of the following frameworks: TensorFlow, PyTorch or JAX
- Excellent written and spoken English

While not required, we would also love to hear about any of your:

- Experience in building qualitative listening tests for either online crowds or experts.
- Experience packaging models and audio processing pipelines for deployment.
- Experience with distributed training of DL models.
- Experience with both RNN and Transformer based TTS.
- Contributions to open source frameworks or tools for speech processing.
- Experience using tools like Google Cloud, Docker, Kubernetes or Kubeflow.
- Publications at top-tier ML conferences such as ICASSP, Interspeech, ACL, NeurIPS, ICLR, ICML or AAAI.

What we offer

- Participate in developing a top-notch streaming entertainment platform used by over a million users worldwide
- Plenty of autonomy and responsibility
- Your own yearly education budget
- A workplace that values creativity and personal initiative
- Unlimited audiobooks and ebooks from our own service
- An international team of super-talented colleagues
- Explore, work with and implement some of the newest and hottest technologies
- A company full of book lovers
- Ability to work from any of our offices in Umeå, Stockholm, Karlstad, Lund and Copenhagen

Does this sound like you? If you feel like Storytel is a place where you could thrive, let us know and we will contact you as soon as possible.

Contacts at this company

Maria Runesson

Hanna Lindroth

Josefine Neurath

Summary

Workplace: Storytel
1 position
Permanent (until further notice)
Full-time
Fixed monthly, weekly or hourly salary
Published: 2 September 2021
Apply by: 19 February 2022

Visiting address

Valhallavägen 117H, Stockholm

Postal address

Valhallavägen 117H
Stockholm, 11531
