So I've been working on a project to create a singing voice synthesizer. I'm basing it on the description in Jordi Bonada's PhD thesis "Voice Processing and Synthesis by Performance Sampling and Spectral Models". After a lot of trouble getting TWM f0 estimation to work, I've finally gotten to implementing MFPA (Maximally Flat Phase Alignment). And amazingly, it seems to have worked first try. Compare my results: https://i.ibb.co/dsvgv0fd/Screen-Shot-2026-03-02-at-3-54-48-PM.png with the results in the thesis: https://i.ibb.co/C3fjdWVd/Screen-Shot-2026-03-02-at-3-55-09-PM.png












