BrainModular Users Forum

Better support for multicore processor ?

Tell us what you'd like Usine to do
fabthesabre
New member
Posts: 7

Unread post by fabthesabre » 02 Dec 2008, 19:34

Hello,

Is it possible to add a feature, as in the Windows Task Manager, to choose (manually?) a core affinity? For example: VST x on core 1 and VSTi y on core 2, and so on (or track 1 on core 1, track 2 on core 2, etc.).
Because despite my BiiiiG processor (E8400 @ 4GHz), the multi-core support doesn't work very well: the load is limited to a single core. I have tried running two instances of Usine and manually setting their affinity in the Windows Task Manager, and it works very well, but it isn't very practical.
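
For reference, the manual Task Manager step described above can also be scripted. The sketch below is only an illustration: it builds the hexadecimal affinity mask that the Windows `start /affinity` command expects and prints one launch line per core. Core indices here are zero-based, so "core 1" and "core 2" above correspond to indices 0 and 1. The path `Usine.exe` and both function names are placeholders, not anything confirmed in this thread.

```python
def affinity_mask(cores):
    """Return a bitmask with one bit set per zero-based core index."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


def start_command(exe_path, cores):
    """Build a cmd.exe line that launches exe_path pinned to the given cores."""
    # `start` takes the mask as a hexadecimal number; the empty "" is the window title.
    return 'start "" /affinity {:X} "{}"'.format(affinity_mask(cores), exe_path)


if __name__ == "__main__":
    # Hypothetical path; one instance per core of a dual-core CPU.
    for core in (0, 1):
        print(start_command("Usine.exe", [core]))
```

This only automates launching separate instances; it does not split a single instance's audio engine across cores.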

Thanks,

Fab

PS: I just discovered Usine, following the advice of a friend (you know who you are). It's really a great piece of software! And the next level that you promise (transparency, skinnable interface, still more functions) will make me a glad customer in the near future, I think. Usine Powâ ! :)

martignasse
Site Admin
Posts: 611
Location: Lyon, FRANCE

Unread post by martignasse » 02 Dec 2008, 20:39

hi fabthesabre,

welcome to the Usine world... :)

As said in this thread:
http://www.sensomusic.com/forums/viewtopic.php?id=997

Usine uses one core for audio and another for display.

I don't know if it's possible to assign a specific VST to a specific core inside Usine.

Given the modular way Usine works, it's not easy to dispatch the audio engine across more than one core (thread).

Let's see what the boss has to say about this.
Martin FLEURENT - Usine Developer - SDK maintainer

senso
Site Admin
Posts: 4424
Location: France

Unread post by senso » 03 Dec 2008, 15:17

You're right Martin.

In fact the problem is not solvable in Windows (and probably other OSes).
Thread scheduling has a granularity of around 50ms, while audio needs scheduling on the order of 0.01ms...
It means that I could implement a real multi-threaded engine, but it would increase the overall latency by 2x50ms (with a buffering system).
And for 'live oriented' software, 100ms of latency is ... impossible...
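
The arithmetic behind this estimate, as a rough sketch. The 50 ms figure is the scheduling delay quoted above, not a measurement, and the function name is only for illustration. The assumption is that each buffered hand-off between threads (one into the worker threads, one back out) costs one scheduling delay.

```python
def added_latency_ms(scheduling_delay_ms, handoffs=2):
    """Extra latency when each buffered thread hand-off costs one scheduling delay."""
    return scheduling_delay_ms * handoffs


if __name__ == "__main__":
    # Figures quoted in the post: 50 ms per hand-off, two hand-offs.
    print(added_latency_ms(50))  # 100
```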

That is why, for example, in Reaper, when you push the play button the audio engine starts 0.5 seconds later...
Now imagine that you trigger a sample on stage and the sound comes out 0.5 seconds later!

That's also why there is no dual-core optimization in Reason, even though they probably have 20 developers on the team...

For some special effects like a reverb, you can use multi-threading to calculate the tail of the sound, which is supposed to arrive much later, but that's a special case.

It's not definitive of course, but there is currently no technical solution.

Clearscreen
Member
Posts: 482
Location: Australia

Unread post by Clearscreen » 26 Mar 2009, 00:33

i've been thinking about this and the way that plogue bidule can assign "trees" of modules to different cores. for example, if you right click a module (in bidule) there's an 'mp assign' setting, and you can set that module (and all modules connected above it) to be assigned to a core other than core 1 (ie core 2, 3, 4 etc...). is there a chance something like this could work for usine? whether it be set by module or by track? or even a usine module that sets the affinity of everything chained above it to run on a higher (selectable) core number? i can see where the open modularity of usine might make this tricky (eg often i'll have a midi sequencer running out to several different synths, so if i set one synth to core 2, how does that affect the other synths the sequencer is also connected to?) but i thought there might be something useful in it.

lately i've been running up against the limits of one core by experimenting with a lot of impulse response stuff and it made me wonder about it...
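
The "assign a module and everything connected above it" idea, and the shared-sequencer conflict raised in the same post, can be sketched as a small graph walk. This is a generic illustration, not how Bidule or Usine work internally; the patch layout, module names, and function names are all hypothetical.

```python
from collections import deque


def upstream(connections, module):
    """Return module plus every module that feeds it, directly or indirectly.

    connections maps each module name to the list of modules it sends to.
    """
    # Invert the graph once: for each module, which modules feed it?
    feeders = {}
    for src, dests in connections.items():
        for dst in dests:
            feeders.setdefault(dst, set()).add(src)

    seen = {module}
    queue = deque([module])
    while queue:
        current = queue.popleft()
        for src in feeders.get(current, ()):
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen


def shared_modules(connections, assignments):
    """Modules claimed by more than one core when each assignment pulls in its upstream tree.

    assignments maps a module name to the core chosen for it.
    """
    claimed = {}
    for module, core in assignments.items():
        for name in upstream(connections, module):
            claimed.setdefault(name, set()).add(core)
    return {name: cores for name, cores in claimed.items() if len(cores) > 1}


if __name__ == "__main__":
    # One sequencer driving two synths that both feed a mixer.
    patch = {"sequencer": ["synth_a", "synth_b"], "synth_a": ["mixer"], "synth_b": ["mixer"]}
    print(shared_modules(patch, {"synth_a": 1, "synth_b": 2}))
```

In this example the sequencer ends up claimed by both core 1 and core 2, which is exactly the case that needs a routing rule.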

senso
Site Admin
Posts: 4424
Location: France

Unread post by senso » 26 Mar 2009, 09:09

But multicore dispatching affects the latency...
Currently there is no real solution, especially if we consider that Usine is about 10x more complex than Bidule (which is also great software).
But be patient, one day I'll find one...

Clearscreen
Member
Posts: 482
Location: Australia

Unread post by Clearscreen » 26 Mar 2009, 12:07

no worries, just a thought...

gurulogic
Member
Posts: 1019

Unread post by gurulogic » 28 Mar 2009, 01:24

This might be naive and I definitely don't know anything about how this stuff actually works, but how about a host shell for Usine that can automatically load an instance of Usine for each core on the machine, with a single common save/load function, a tabbed interface for switching instances, and perhaps some shared memory functions? If something like the Steinberg multiclient ASIO driver were used, then all instances would be able to share the same audio ports.
Then there could be a special bus that acts like jsoundbus for cascading audio and data streams to the next appropriate instance, with some limitations based on multicore friendliness as to what could be routed where, so as to keep things efficient.
I guess though that despite all of this fancy thinking it would probably just make more sense to have each track in Usine assigned to its own core and have some routing rules in place for multicore use...

I am essentially doing the above in Live 7 and it is working quite well. I have four copies of Usine VST loaded onto four return tracks in Live and an instance of jsoundbus inserted on each audio input channel in Live, each of which is terminated as audio to sends only.

Two of the return channels are identical dedicated Usine FX racks, and each half of my audio sources is jsoundbussed directly into the Usine FX rack instances. The other two Usine racks are a combo of VSTI and master processing racks, and I jsoundbus directly from the VSTIs into the Usine FX racks for further processing. So far all I have added is an additional 5.8ms of latency and all my audio is perfectly in sync.

I can't at the moment recall exactly why I had to do it this way. It was something to do with delay sync in conjunction with the master processing, or putting a Usine instance on the master in Live seemed to mung up the core usage balance (that was a long day). Anyway, I have the FX racks feeding the master mix and then a jsoundbus feeding the master mix into the last Usine instance, which is routed to the outputs the master mix would have been using. This presumably adds another 5.8ms of latency, but I really don't notice.

So yeah, that's just what I have had running through my head on the matter... For what I am doing, I am really only using Live as a way to balance my CPU load and manage my projects and, up until I discovered Usine, as a pretty interface. I don't use Live for anything it was intended for, so I could easily ditch it in favor of a more multiprocessing-friendly Usine solution...

senso
Site Admin
Posts: 4424
Location: France

Unread post by senso » 28 Mar 2009, 21:01

gurulogic wrote:I guess though that despite all of this fancy thinking it would probably just make more sense to have each track in Usine assigned to its own core and have some routing rules in place for multicore use...
Of course I could do that, but with a latency of at least 50ms, which is too much...

gurulogic
Member
Posts: 1019

Unread post by gurulogic » 29 Mar 2009, 03:45

Somehow Ableton Live and Jsoundbus are doing magic with Usine, because I have crackle-free multicore processing working very well with reasonably low latency.
I just took some time to run some tests to confirm for myself that there are no surprises waiting in this setup, which I am still building more than actually using right now.

Test 1: Enable devices in my four Usine return channel inserts until I have 70% CPU load on each of 4 cores. This includes audio tracks from Live running directly via Jsoundbus to two alternate FX rack instances of Usine, and two Usine instances containing VSTIs also running audio directly via Jsoundbus to the Usine FX racks, with the outputs of each Usine FX rack run via Jsoundbus back into one of the VSTI Usine instances and then to my soundcard output.
A pre-recorded audio file plays through this entire chain with 12ms of latency, as tested by recording the audio from the end of the chain. This would be 6ms if I were not using Jsoundbus to run back into a Usine instance for master processing.
Delay compensation in Live is disabled for these tests.

Test 2: With the same CPU load, I connect a physical output of my soundcard to an input of my soundcard for a round-trip latency test @ 256 sample buffer. The strange part here is that I get a 5ms round trip when it should be 12-14ms..???
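
One way to read the figures in these tests: the latency of a single buffer is its length in samples divided by the sample rate. The sample rate is not stated in the post, so the 44.1 kHz used below is an assumption. At that rate a 256-sample buffer is about 5.8 ms, which matches the per-hop figure reported earlier, and two such hops come to about 11.6 ms, close to the 12 ms measured in Test 1. The sketch does not explain the 5 ms round trip in Test 2.

```python
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


if __name__ == "__main__":
    # 256-sample buffer as in the post; 44100 Hz is an assumed sample rate.
    one_hop = buffer_latency_ms(256, 44100)
    print(round(one_hop, 1))      # 5.8
    print(round(2 * one_hop, 1))  # 11.6
```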


Unless there is something that I am completely missing that is going to screw me in the end, this seems like an ideal workaround for anyone needing to use more cores with Usine. Of course you will need a host that cooperates and lots of ram to spare. If anyone does want to try this and has any questions, I would be happy to try and help.

I have attached a picture for fun...

Image

23fx23
Member
Posts: 2545

Unread post by 23fx23 » 29 Mar 2009, 04:39

sounds really interesting..

amiga909
Member
Posts: 324

Unread post by amiga909 » 29 Mar 2009, 11:00

interesting +1 for that ableton example.

haven't considered quadcore yet, and I don't have much knowledge about it.
multicore + audio seems pretty confusing however.

IMHO, apart from the technical possibility that Usine could be faster with quadcore, maybe it's important to consider that introducing multicore and rewriting an app for hyperthreading is one hard task, and maintaining and debugging multicore audio software is another, maybe even harder task.

personally I am not into the multicore 'hype' nowadays: it's treated as the major requirement for any audio software, I feel. I'd rather opt for software that works consistently and predictably, and does not surprise me with unexpected cpu spikes.

gurulogic
Member
Posts: 1019

Unread post by gurulogic » 31 Mar 2009, 01:04

Another interesting finding is that jsoundbus seems to need to be sample locked by the hosts at both source and destination.
If I try using jsoundbus to route audio between standalone applications, it craps out, but if I run it in both a host and an application ReWired to that host, it runs just fine. This means there is probably no way to use jsoundbus to tie together multiple running copies of Usine standalone, unless it were somehow possible to sample-lock the sync of both instances.
This is just an assumption based on my findings...

senso
Site Admin
Posts: 4424
Location: France

Unread post by senso » 11 Apr 2009, 21:38

The next version of Usine will be really dual core optimized.
After a few months of research, I now have the technology to synchronize audio threads without increasing the latency.
Image

The Quad Core optimization will follow soon.
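
The post does not say how the synchronization works. Purely as an illustration of the general idea (split one audio block across threads, then wait for all of them before the block is output, so no extra buffer is added), here is a generic sketch. It is not Usine's implementation: `process_track` and `process_block` are hypothetical names, the per-track "effect" is just a gain, and Python threads are used only to show the control flow, not real-time performance.

```python
from concurrent.futures import ThreadPoolExecutor


def process_track(samples, gain):
    """Stand-in for per-track DSP: apply a gain to every sample."""
    return [s * gain for s in samples]


def process_block(tracks, gains, pool):
    """Process independent tracks in parallel and mix them within the same block."""
    # Submit every track, then wait for all results before mixing.
    futures = [pool.submit(process_track, t, g) for t, g in zip(tracks, gains)]
    processed = [f.result() for f in futures]
    # Mix: sum the tracks sample by sample.
    return [sum(frame) for frame in zip(*processed)]


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        block = process_block([[1.0, 2.0], [3.0, 4.0]], [0.5, 2.0], pool)
        print(block)  # [6.5, 9.0]
```

The join at the end of each block is the critical part: it only keeps latency unchanged if every thread finishes well inside the block's time budget.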

gurulogic
Member
Posts: 1019

Unread post by gurulogic » 11 Apr 2009, 23:31

Woot! That's great news!
Any chance to also allow a buffer setting for the Usine VST inlets? I believe this will allow proper multicore use in VST hosts such as Live, and another few ms of latency is not too bad when playing computer-generated instruments or instruments that are not critical for realtime monitoring.

As an alternative to my previously posted jbridge solution (which can sometimes have problems when the send and receive instances are not initiated in the right order), I have found another workaround for using multiple audio i/o in Usine VST in a multiprocessing host.
The solution is to create a rack on a Live MIDI track, create a chain, and then insert a copy of MidiAudio_throughput http://nwrecords.com/storage/ableton_re ... ughput.dll and a Usine VST inlet for each of the audio ins and outs you want to have available for Usine VST. Now you can send any audio track in Live to a Usine inlet or receive from the Usine inlet to any audio track. This adds additional latency as set by your audio hardware. For no additional latency you can directly access your hardware i/o by inserting the Live external audio effect before and after the Usine instance instead of using the MidiAudio_throughput workaround.
This allows you to have a multi i/o Usine running on one CPU core while you distribute other VST load (or copies of Usine) across the remaining cores.

Perhaps this would be good stuff for the wiki?

Image

Clearscreen
Member
Posts: 482
Location: Australia

Unread post by Clearscreen » 15 Apr 2009, 01:14

senso wrote:The next version of Usine will be really dual core optimized.
After a few months of research, I now have the technology to synchronize audio threads without increasing the latency.
http://www.sensomusic.com/forums/upload ... alcore.jpg

The Quad Core optimization will follow soon.
Great news!! Thanks for putting the work into this senso :)

runagate
Member
Posts: 288
Location: Austin, Texas, USA

Unread post by runagate » 15 Apr 2009, 02:01

I'm definitely excited by this news now that I'm rocking a 4GHz i7 quad.
