  • proper 🔩
    Dec 21, 2025
    ·
    2 replies
    Platinum

    My point is that you wouldn't want to pick and choose albums - complete opposite of what you are talking about. Feel like you aren't reading my messages, as I'm obviously not talking about your average music listener

    oh u guys are talking about some ai s*** lol nvm

    idgaf about ai

  • Dec 21, 2025
    ·
    1 reply
    Oblivion X

    Yeah, you don't need to download the whole thing. But if you're not downloading the whole thing, what's the difference from just using another music scraper where you can already download music in batches?

    That scale is definitely feasible with other scraping tools. You can easily download whole discographies via torrents that add up to that size.

    If you're super determined, yes, that's true, but my feeling is your average ML person is not going out of their way to download that amount of songs, with the needed thought into what discographies are good to search for, how big each discography is, etc. Whereas now, with the dataset readily available, it's an 'interesting problem' ready for someone to tackle.

    So that's what I mean about less friction. It's like having a Kaggle competition ready for you to tackle. (I assume you work in the field based on your msgs, but if not then apologies if I'm using terms you're not familiar with)

    I also think close to 300 TB is not too much of an ask for an individual/team that really wants to tackle this. You won't be storing that data long term and will never need to store it all at once, as you'll just convert it all into features to train your model on and then delete the audio files.
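
A rough back-of-envelope sketch of that argument, in Python. Only the 300 TB figure comes from the thread; the bitrate, the feature layout, the batch size, and the helper names are all illustrative assumptions, not measurements of the dataset or anyone's actual pipeline.

```python
def feature_size_tb(audio_tb, audio_kbps, feature_bytes_per_sec):
    """Estimate storage (TB) for features derived from `audio_tb` of compressed audio."""
    audio_bytes_per_sec = audio_kbps * 1000 / 8      # kilobits/s -> bytes/s
    seconds = audio_tb * 1e12 / audio_bytes_per_sec  # total playback time in seconds
    return seconds * feature_bytes_per_sec / 1e12


def peak_storage_tb(batch_audio_tb, total_feature_tb):
    """Worst-case disk use when audio is processed one batch at a time and then deleted."""
    return batch_audio_tb + total_feature_tb


AUDIO_TB = 300             # total compressed audio (figure from the thread)
AUDIO_KBPS = 160           # ASSUMED average bitrate of the audio files
FEATURE_BPS = 64 * 50 * 1  # ASSUMED features: 64 mel bins, 50 frames/s, 1 byte per value

features = feature_size_tb(AUDIO_TB, AUDIO_KBPS, FEATURE_BPS)
peak = peak_storage_tb(batch_audio_tb=1, total_feature_tb=features)
print(f"features: {features:.0f} TB, peak disk: {peak:.0f} TB")
```

Under these assumptions the features come to roughly a sixth of the audio size. The result is sensitive to the feature choice, though: a denser representation (say 128 bins at 100 frames/s stored as 16-bit floats, 25,600 bytes/s) would come out larger than the compressed audio, so how much the data shrinks depends on what you actually train on.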

    Anyway in case this comes across as rambling now then sorry lol, was a good convo though

  • Dec 22, 2025
    ·
    1 reply
    Platinum

    I mean, they could just pick the top listed artists and scrape their discogs. It would even be better, since it would be in higher quality than this database.

    While, yeah, having it all in one place makes it easier to download, I don't think the difference is big enough to be against this.

    For a team doing the full 300 TB, storing the data wouldn't be the biggest roadblock; it would be training on all that data and how much power that would take

  • Dec 22, 2025

    Lmao

  • Dec 22, 2025
    ·
    1 reply
    suzuki

    Slsk all u need

    Wrong

  • Dec 22, 2025

    F*** Spotify

  • Dec 22, 2025
    ·
    edited
    ·
    1 reply
    Oblivion X

    I don't really see much correlation between the 300 TB of files and the power needed to train a model. That 300 TB will be much reduced when you convert it into the data you will actually train on.

    It's possible we are thinking of training on different things though.

    The point you make about it being higher quality on things like Soulseek is true, I agree. Ultimately though, I think what will probably happen is that some startup or existing company downloads the data and uses it in a model that's used behind the scenes in some type of service for labels/musicians.

    Do you work/study in the field?

  • Dec 22, 2025
    ·
    1 reply
    Platinum

    Larger training models end up using more energy to train

    Not data science, but Electrical and Computer engineering

  • Dec 22, 2025
    Oblivion X

    Yes, that's true, but I meant that it won't be trained on literally 300 TB of data. (There is a correlation between dataset size and the power needed to train, but 300 TB of audio files =/= a 300 TB dataset)

    And cool, nice. Electrical engineering is still such a good foundation to have

    Anyway, I feel I've derailed the thread enough lol

  • Dec 22, 2025
    proper

    oh u guys are talking about some ai s*** lol nvm

    idgaf about ai

  • Dec 22, 2025
    Benny Boy

    Soulseek has the stuff that isn't on streaming. Would be lost without it tbh

    I mean did I say otherwise

  • Dipset Forever

    Wrong

    What else do you use??

  • Apr 15
    ·
    2 replies

    Welp

  • proper

    oh u guys are talking about some ai s*** lol nvm

    idgaf about ai

  • Laced

    Welp

    https://twitter.com/i/status/2044446399811715389

    in the immortal words of Bossman Dlow

    "I ain't never turned myself in
    Do your job b****, come find me!"

    they might've won that default judgement but it'll be hell getting any of the money they're now owed

  • Laced

    Welp

    https://twitter.com/i/status/2044446399811715389

    gotta make sure poor old spotify gets theirs 💔💔💔