To add to this insight, there are many recent publications showing the dramatic improvements of adding another modality like vision to language models.
While this is my conjecture that is loosely supported by existing research, I personally believe that multimodality is the secret to understanding human intelligence.
You do realize that every posted on the Fediverse is open and publicly available? It’s not locked behind some API or controlled by any one company or entity.
Fediverse is the Wikipedia of encyclopedias and any researcher or engineer, including myself, can and will use Lemmy data to create AI datasets with absolutely no restrictions.