[Feature] Idea: Add "official" web search UI #4627
There are already some web UIs to search Scoop packages, like my favorite https://shovel.sh (no affiliation with the so-named fork) or https://rasa.github.io/scoop-directory/search, which searches more buckets (which may not necessarily be a good thing?). They simplify the task by searching many buckets (not necessarily just the locally added ones) and offering better UX (direct links to the manifest, the app website, etc.).

I think it would be a good idea to create a "canonical" one under this org (to centralize development) and feature it on the https://scoop.sh website. This would also serve as a nice showcase for Scoop, since any potential new user arriving on the website could check which apps are available to install.

(I would have created this issue in the website's repo, but I couldn't find it. Please respond with the link if I'm just being blind/dumb.)

Comments
There have been many such efforts in the past, but none of them is official as of now. Since all of them work quite well (each with a different method/scope), merging one here hasn't been a priority. That said, if any of the authors is willing to merge their implementation here, PRs are most welcome!
Website gets served from this repo's …
While centralizing development is not necessary, at least putting a link/searchbar on the main page seems like a beneficial step. Ping @mertd -- would you be ok with a link (or a searchbar)?
Putting a link to a single search utility would give users the impression that it is Scoop's official search implementation. We would certainly not like to give preference to any community search utility -- that is, unless the author is willing to merge their implementation upstream.
GitHub advanced search is my first choice for searching manifests, then …
Well, if we're talking about personal use - I wrote a tiny PowerShell script to search for manifests, as I don't really like leaving the terminal 😜 https://github.com/rashil2000/scripts/blob/main/Find-Scoop.ps1
Thank you for the shout-out @JanPokorny! Of course I would not have any issues with the shovel search being linked or integrated on scoop.sh. I also understand your concerns @rashil2000. Some ideas to address these (maybe one or a combination of these):
And we could even split this up into …
I'm not attached to scoop-directory being chosen, but it or ScoopSearch seem to be good candidates for 1. Edit: shovel would also be a fine choice for 1. Yet none of these are built on top of PowerShell, which powers Scoop. Does that matter? In my opinion, no.
Since shovel.sh searches the known buckets, it fits the "official" tool criteria nicely.

I think this is how we can go about it:
We should rope @gpailler into the conversation too. ScoopSearch is as good a candidate as shovel.sh, and it sits in a convenient GitHub org of its own. The procedure for merging upstream would be roughly the same as above.
I understand that the only change you would require to the code or other content would be integrating the current contents of the scoop.sh front page. The author of the integrated web search (be that @gpailler or me) will retain copyright and ownership (under the respective license) of the transferred repositories, but should discuss major changes with other members of the @ScoopInstaller organization. Is this how you envisioned it @rashil2000? If so, then I think I like this approach. The original goal for shovel.sh was to offer search functionality for scoop.sh the way formulae.brew.sh does for brew.sh. With this, and some work to make the manifest pages indexable by external search engines (mertd/shovel#10), that goal would be fulfilled.
Yes, precisely.
Can we discuss the benefits and drawbacks of each option before we commit to a particular tool? For example, I think there are many benefits to the back-end crawler creating an indexed database that can be quickly searched. This seems preferable to a large JSON object. I guess it could create both, if we wanted, but the DB would be preferable, IMO.
If you create an SQLite database, you can even query it quickly from the frontend: https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/
Yes, this is what scoop-directory uses :) Although the DB file isn't that big -- just around 7-8MB.
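For scripting against such a downloaded DB (as opposed to querying it in the browser per the link above), here is a minimal sketch; the table and column names are made up for illustration, not scoop-directory's actual schema:

```powershell
# Assumes sqlite3.exe is on PATH (e.g. via 'scoop install sqlite') and that
# the downloaded DB has an 'apps' table -- a hypothetical schema.
$dbPath = Join-Path $env:TEMP 'scoop-directory.db'
$term   = 'zip'

# Plain LIKE query; a real tool would sanitize $term before interpolating.
sqlite3 $dbPath "SELECT name, description FROM apps WHERE name LIKE '%$term%' LIMIT 20;"
```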
Thanks for your interest in ScoopSearch, guys 😄 It would be great to have an official web search engine for Scoop packages, and I'm willing to help with that and to transfer the repos if it makes sense to you. The ScoopSearch backend is hosted on Azure (Azure Search and Azure Functions) and costs USD 0/month (Free Tier). Even with more traffic, I think we will stay within the Free Tier. I can transfer/configure the backend to an "official" scoop.sh Azure subscription too. I also checked quickly, and we can dump the full search index to JSON if we want to provide both online and offline search capabilities. Merging all our projects and providing a unified solution to search Scoop packages is definitely the best solution for the community. So it's up to you now!
Just a quick question: how large would that JSON be? shovel.sh only indexes known buckets, and the single-line JSON there is already 4.81MB.
The shovel.sh JSON was so large because I didn't filter out keys from the manifest files that are not needed for searching (mertd/shovel-data#9). After filtering, the file is down to 1.16MB. I believe there are benefits to each of the approaches chosen by @gpailler, @rasa, and me respectively. For shovel.sh, I went with an approach targeting simplicity and least cost: generate the index as part of a scheduled GitHub pipeline and host it and the web app statically. Of course this has drawbacks too; executing the actual search on a back end would make search performance less dependent on the client.
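To make the filtering idea concrete, here is a rough sketch of such a build step -- the bucket path and field list are illustrative assumptions, not shovel's actual schema:

```powershell
# Keep only the manifest fields useful for search; drop everything else.
# The field names below are an assumption for illustration.
$fields = 'version', 'description', 'homepage', 'license'

$index = Get-ChildItem -Path .\bucket -Filter *.json | ForEach-Object {
    $manifest = Get-Content -Path $_.FullName -Raw | ConvertFrom-Json
    $entry = [ordered]@{ name = $_.BaseName }
    foreach ($field in $fields) {
        if ($null -ne $manifest.$field) { $entry[$field] = $manifest.$field }
    }
    [pscustomobject]$entry
}

# -Compress emits the single-line JSON mentioned above.
ConvertTo-Json -InputObject @($index) -Compress | Set-Content -Path .\index.json
```

A scheduled GitHub Actions workflow could run this on a cron trigger and publish the result, keeping the hosting fully static.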
I see. |
I checked this morning, and I ended up with a JSON file of 9.95MB for ~15,300 documents in the index. The reason for this "small" size is that the manifests are parsed and only the relevant information is added to the index. For example, only the following content is stored in the index for the 7-zip manifest:
As @mertd said, parsing the manifests adds some complexity, but it was required with my approach, as I had to populate the Azure Search index properly and keep the index size under control (the Free Tier limit is 50MB).
That seems pretty reasonable to me, thanks for the info!
Integrating CLI search into this seems like a difficult problem when you think about it. The CLI might have non-public buckets added (thus needing local indexing). Also, if the database is statically hosted and queried on the client, re-downloading the database from the backend crawler every time it is outdated (which will happen often) might be slower than the current `scoop search`.
I feel we shouldn't need to worry about local indexing as of now. The CLI @rasa is talking about would probably be a separate search tool with a non-PowerShell implementation.
We could set a time interval for this, like 4 hours (or maybe a day). Quite a few tools (like tealdeer) do this, and it doesn't really add much delay. For instance, the little search tool I mentioned in #4627 (comment) downloads the DB file if it's older than a day, and the 7-8MB file does not take more than a couple of seconds. A one-time update like this can be tolerated IMO.
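For reference, the refresh-if-stale pattern described above is only a few lines; the URL and cache path here are placeholders, not real endpoints:

```powershell
# Re-download the index only when the cached copy is missing or too old.
$dbUrl  = 'https://example.com/scoop-index.db'   # placeholder URL
$dbPath = Join-Path $env:LOCALAPPDATA 'scoop-search\index.db'
$maxAge = [TimeSpan]::FromDays(1)

$stale = $true
if (Test-Path $dbPath) {
    $age   = (Get-Date) - (Get-Item $dbPath).LastWriteTime
    $stale = $age -gt $maxAge
}

if ($stale) {
    New-Item -ItemType Directory -Force -Path (Split-Path $dbPath) | Out-Null
    Invoke-WebRequest -Uri $dbUrl -OutFile $dbPath
}
```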
That seems to be how winget works: it downloads the database whenever it detects it's out of date. We should compress the .json blob and/or SQLite .db to speed up downloading.
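As a sketch of that compression step (file names hypothetical), gzipping from PowerShell via .NET's GZipStream:

```powershell
# Gzip the index before publishing; plain-text JSON usually compresses well.
$raw  = [System.IO.File]::ReadAllBytes("$PWD\index.json")
$out  = [System.IO.File]::Create("$PWD\index.json.gz")
$gzip = New-Object System.IO.Compression.GZipStream($out, [System.IO.Compression.CompressionMode]::Compress)
$gzip.Write($raw, 0, $raw.Length)
$gzip.Dispose()   # Dispose flushes the gzip footer
$out.Dispose()

# Compare the sizes to see the win.
Get-Item "$PWD\index.json", "$PWD\index.json.gz" | Select-Object Name, Length
```

Many static hosts and CDNs also compress transparently via Content-Encoding, so an explicit .gz artifact mainly helps CLI consumers.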
@rasa Should we start a poll to vote on this? I'll tag the active maintainers and some recent active contributors.
A poll is a good idea, but I'm not sure how to structure it. My thought is that there are really three (or four) parts to our search functionality:

1. Back-end crawler
2. Web front-end
3. CLI search utility
4. (possibly) a GUI
This issue concerns only the "web" part, i.e. the back-end crawler and web front-end, both of which already exist (in some form) as scoop-directory, shovel.sh and ScoopSearch. So I was thinking of a poll between these three. I don't think many people have tried making a GUI for Scoop (a command-line installer, after all). Nevertheless, that is separate from the website component and is being tracked in #4660. Similarly, for CLI utilities we can track a separate issue (given that there are already two good options -- https://github.com/shilangyu/scoop-search and https://github.com/tokiedokie/scoop-search -- which could be extended to search the website's JSON too).
I have created a poll. Please vote! The outcome of the poll will go through the rough procedure described in #4627 (comment) to get integrated into Scoop upstream. I am tagging some recent/frequent contributors/maintainers (in no particular order) -- your feedback is valuable! @ScoopInstaller/maintainers @tech189 @littleli @igitur @hu3rror @Erisa @Lutra-Fs @segevfiner @LazyGeniusMan @Slach @jcwillox @phanirithvij @AntonOks @RavenMacDaddy @sitiom @wenmin92 @TheRandomLabs @amreus (I just went through the recently merged PRs and picked these names. If your name isn't there, that doesn't mean you can't vote!)
You can also comment below to explain why you chose an option. I'll start. ScoopSearch, because:
Based on preliminary results, it looks like we're going with ScoopSearch. That sounds great! Let me know how I can support ScoopSearch moving forward. I will keep scoop-directory up for the foreseeable future, but will direct users to ScoopSearch as the "official" search engine. Thanks to everyone for voting, and for your past support!
The brand new website for Scoop is up 🎉🎊