Advancing European Sovereignty in HPC with RISC-V

andrew0@lemmy.dbzer0.com · 11 days ago

Be warned that their docs are so-so. Nanonets OCR, Mistral OCR and MinerU will also extract formulas and images.

One other model I forgot to mention is Docling. This one is quite quick to set up in a Docker container, and will have a web interface ready to go where you can upload documents. It sort of follows the PaddleOCR pipeline, but also allows you to use VLMs.

Good luck!

andrew0@lemmy.dbzer0.com · 11 days ago

If you find that OCR doesn’t get you very far, maybe try a small VLM to parse PNGs of the pages. For example, Nanonets OCR will do this, although it is quite slow if you don’t have a GPU. It will give you a Markdown version of the page, which you can then translate with another tool.

PaddleOCR might also be useful, since it focuses on Chinese, but it’s more difficult to set up. Some other options are MinerU and Mistral OCR (the latter is paid, but you can test it for free if you upload your document to Mistral’s library).
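To make the "VLM parses a PNG of the page" idea a bit more concrete, here is a minimal sketch. It assumes the vision model is served through a local Ollama instance on its default port, which is my assumption and not something the tools above require. The model name `some-vision-model` is a placeholder, and the prompt wording is only a suggestion.

```python
import base64
import json
import urllib.request

# Default address of a local Ollama server (assumption: you run Ollama locally).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_page_request(png_bytes: bytes, model: str = "some-vision-model") -> dict:
    """Build the JSON payload asking a vision model to transcribe one page image."""
    return {
        "model": model,  # placeholder: use whichever VLM you have pulled
        "prompt": "Transcribe this page to Markdown. Keep headings, tables and formulas.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(png_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON reply instead of a stream of chunks
    }


def page_to_markdown(png_path: str, model: str = "some-vision-model") -> str:
    """Send one page PNG to the local server and return the model's text reply."""
    with open(png_path, "rb") as f:
        payload = build_page_request(f.read(), model)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

You would call `page_to_markdown("page_001.png")` once per page and then feed the resulting Markdown to whatever translation tool you prefer. Expect it to be slow without a GPU, same as mentioned above.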

andrew0@lemmy.dbzer0.com · 12 days ago

Sure, if all politicians make all their data available to the public. Their phone chat messages, photos taken, everything.

No…? Then don’t bring it up ever again. Initiatives like these will only make it look like you’re a villain if you want privacy.

andrew0@lemmy.dbzer0.com · 1 month ago

All the ones I mentioned can be installed with pip or uv if I am not mistaken. It would probably be more finicky than containers that you can put behind a reverse proxy, but it is possible if you wish to go that route. Ollama will also run system-wide, so any project will be able to use its API without you having to create a separate environment and download the same model twice in order to use it.
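As a rough illustration of the "no need to download the same model twice" point: any project can ask the shared Ollama server which models are already installed before pulling anything. This sketch assumes Ollama is listening on its default local port. The helper names and the model name `example-model:latest` are made up for the example.

```python
import json
import urllib.request

# Default address of a local Ollama server; this endpoint lists installed models.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def fetch_tags(url: str = OLLAMA_TAGS_URL) -> dict:
    """Ask the running Ollama server for its list of installed models."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())


def installed_models(tags_response: dict) -> set[str]:
    """Extract model names (including the tag, e.g. 'name:latest') from the reply."""
    return {m["name"] for m in tags_response.get("models", [])}


def needs_pull(model: str, tags_response: dict) -> bool:
    """True if the model is not installed yet and would have to be downloaded."""
    return model not in installed_models(tags_response)


if __name__ == "__main__":
    tags = fetch_tags()
    # 'example-model:latest' is a hypothetical name; substitute your own.
    print(needs_pull("example-model:latest", tags))
```

Note that the comparison is an exact string match, so the name has to include the tag part exactly as the server reports it.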

andrew0@lemmy.dbzer0.com · 1 month ago

Ollama for API, which you can integrate into Open WebUI. You can also integrate image generation with ComfyUI I believe.

It’s less of a hassle to use Docker for Open WebUI, but Ollama works as a regular CLI tool.

andrew0@lemmy.dbzer0.com · 2 months ago

You really think this is all him? Guy’s got a full team running the media. The Heritage Foundation’s got its grimy hands all over the country.

andrew0@lemmy.dbzer0.com · edit-2 2 months ago

Didn’t this guy just go on French national TV and say that Macron is a dictator?

I think if this guy gets voted in, Romania might say bye bye to its EU funds.

andrew0@lemmy.dbzer0.com · 3 months ago

It would be a shame if someone were to make a post with their office locations across Europe and share it in all the European communities on Lemmy…

andrew0@lemmy.dbzer0.com · 3 months ago

What’s the plan against this? It’s pretty clear that this type of grifting works. Hungary kept Orban in power for far too long, and now Romania might be next.

andrew0@lemmy.dbzer0.com · 3 months ago

Romania has previously jumped into a war, only to change sides later. I wouldn’t be surprised if they end up taking the bait before the upcoming presidential elections. From what I’ve heard, the far-right candidate that’s left in the race is betting that he will get the votes of everyone who voted for Calin Georgescu. His platform? Being a bootlicker for Trump.

Troubling times for the Balkans.

andrew0@lemmy.dbzer0.com · 3 months ago

It was just announced that the EU is pausing sustainability requirements on smaller businesses (< 500 employees) for 2 years. This stems from fears related to the trade war, as they want to keep smaller businesses competitive. Nevertheless, I’m pretty sure that this won’t be great for the environment.

andrew0@lemmy.dbzer0.com · 3 months ago

For notes, I have moved to Joplin with the option to synchronize my data using a WebDAV server. It works really well, and it has both a mobile and desktop app. If you’re interested in developing your project, maybe you can have a look at the options this provides. For example, I really like the ability to separate notes between groups, assign tags, create drawings, and the possibility to use Markdown.

Good luck with your projects! To mirror @enemenemu’s suggestion, I would also look into collaborating with the people trying to push the EU Docs alternative. Not sure if that will work, but it’s worth a shot if you’re interested :D

andrew0@lemmy.dbzer0.com · 4 months ago

Mine’s just one I got from a random kid name generator.

A bit off-topic: not sure why, but I keep seeing posts here on Lemmy lately about Romanian women getting the short end of the stick in terms of gender equality. I hope I’m not offending anyone with this question, but is Romania sticking to the traditional gender roles?

andrew0@lemmy.dbzer0.com · 4 months ago

Fooyin is also a solid choice.

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

Advancing European Sovereignty in HPC with RISC-V

andrew0@lemmy.dbzer0.com · 4 months ago

Oh, I don’t know how I forgot about this. I already signed it last year haha. Thanks for reminding me!

andrew0@lemmy.dbzer0.com · 4 months ago

Well, what I’m thinking about is not too far from education. I am suggesting that we have independent fact-checkers, or at least tools that show all the angles of a certain issue (e.g., something like Ground News, but not owned by a for-profit organisation), paid by tax money. This should be incorporated in something like an API that Fediverse instances could tap into. Again, not governments deciding who is right or who is wrong, but citizen-backed initiatives that work for the people. There should be open source plugins that could be used by fedi instances to relay the fact-checking or other relevant information.

I am categorizing this as governmental regulation because the tax money is allocated by the government specifically for content “moderation”. However, this doesn’t mean that content should be removed from social media just because it talks about a topic (unless it is illegal), but people should at least have additional information available for free that they could research further. And no, I don’t think the community notes employed by Meta and Twitter are enough, as we’ve seen how that went for the Americans in the last election.

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

Mate, I am not advocating here for the EU to break E2EE, nor to support linking your social media profile to a citizen number that can be used against you. Similarly, I do not wish for the EU to start building its own propaganda machine to replace the US one.

I am merely stating that we should invest more in EU open source, and promote more fact-checking and open algorithms (or even ban algorithms) for social media. What is happening with Twitter, Facebook and TikTok is not ok. We do not need social media to profile us and push content that a state or rich person deems necessary for their benefit. Aren’t you on Lemmy specifically because of that?

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

Thank you for the information. I’ll keep that in mind for the future! To be honest, I intended to find a petition on this topic and share that to incentivize some mobilization, but I could not find anything. This was the most recent article I could find on the topic, given my limited time to research. If you have a suggestion on an article, I would love to change it to that!

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

EU Digital Sovereignty - Time to provide alternatives to US/Chinese big tech

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

It’s a bit short-sighted to say that Trump is the one calling the shots here, specifically to weaken the US. It is pretty clear that he is following the plan put forward by the Heritage Foundation word for word. If I understood correctly, the idea is to make the American economy more resilient at the expense of all of its (poor) citizens. Once that is done, they can then leverage their safe zone to further influence policies in other countries. For example, get the EU to lower regulations, so American companies can extract more wealth.

Here is a quote from the actual “Project 2025 Mandate for Leadership” PDF:

Needed reforms

[…]

Increase allied conventional defense burden-sharing. U.S. allies must take far greater responsibility for their conventional defense. U.S. allies must play their part not only in dealing with China, but also in dealing with threats from Russia, Iran, and North Korea.

Make burden-sharing a central part of U.S. defense strategy with the United States not just helping allies to step up, but strongly encouraging them to do so.

Support greater spending and collaboration by Taiwan and allies in the Asia–Pacific like Japan and Australia to create a collective defense model.

Transform NATO so that U.S. allies are capable of fielding the great majority of the conventional forces required to deter Russia while relying on the United States primarily for our nuclear deterrent, and select other capabilities while reducing the U.S. force posture in Europe.

Sustain support for Israel even as America empowers Gulf partners to take responsibility for their own coastal, air, and missile defenses both individually and working collectively.

Enable South Korea to take the lead in its conventional defense against North Korea.

[…]

They are engineering most of these situations that we’ve seen in the media specifically to make the ideas more digestible to the average population. See the Zelenskyy case: “This is going to be great television” - the guy is not even hiding it.

On one hand, Taiwan is right to say that the US won’t abandon them. The US does not produce enough chips locally to just let them get gobbled up by China. However, this sort of “theatrics” is not over, and they will come up with a reason to scare Taiwan into investing a lot more in defence, specifically to prepare them for a fight to destabilize China.

It’s truly sad that this administration is now in power to push these ideas. The average American is going to become much poorer and more hateful as all the protections previously put in place are dismantled. Hopefully people wake up and kick them out of office, but the damage to foreign relationships is already done.

andrew0@lemmy.dbzer0.com · 5 months ago

Precisely. The only enemy that the US conservative party sees is China. Everyone else is a business partner that they must strong-arm into favourable deals for the US.

andrew0@lemmy.dbzer0.com · edit-2 5 months ago

We can all help Ukraine - UNITED24

andrew0@lemmy.dbzer0.com · 6 months ago

Archiving papers using Zotero headless?

andrew0@lemmy.dbzer0.com · edit-2 10 months ago

Redox OS 0.9.0 - Redox - Your Next(Gen) OS

andrew0@lemmy.dbzer0.com · 2 years ago

Poll: GUI framework for widgets/apps in Wayland

andrew0@lemmy.dbzer0.com · edit-2 2 years ago

Jump from Arch to NixOS?

andrew0@lemmy.dbzer0.com · edit-2 2 years ago

Sites or Trackers for Exam Dumps
