
[Feature] Idea: Add "official" web search UI #4627

Closed
JanPokorny opened this issue Jan 6, 2022 · 33 comments

Comments

@JanPokorny

There are already some web UIs for searching Scoop packages, like my favorite https://shovel.sh (no affiliation with the so-named fork) or https://rasa.github.io/scoop-directory/search, which covers more buckets (not necessarily a good thing?). They simplify work by searching beyond the locally added buckets and offering better UX (direct links to the manifest, app website, etc.).

I think that it would be a good idea to create a "canonical" one under this org (to centralize development) and feature it on the https://scoop.sh website. This would also serve as a nice showcase for Scoop, since any potential new user arriving on the website can check what apps are available to install.

(I would create the issue in the website's repo, but I couldn't find it. Please respond with the link if I'm just being blind/dumb.)

@rashil2000
Member

There have been many such efforts in the past, but none of them is official as of now. Since all of them work pretty nicely (and each with a different method/scope), merging one here has not been a priority. Though if any of the authors is willing to merge theirs here, PRs are most welcome!

> (I would create the issue in the website's repo, but I couldn't find it. Please respond with the link if I'm just being blind/dumb.)

The website is served from this repo's gh-pages branch, so there is no separate repo.

@JanPokorny
Author

JanPokorny commented Jan 6, 2022

While centralizing development is not necessary, at least putting a link/searchbar on the main page seems like a beneficial step. Ping @mertd -- would you be ok with a link (or a <form>) on scoop.sh leading to shovel.sh?

@rashil2000
Member

Putting a link to a single search utility would give users the impression that it is Scoop's official search implementation. We would certainly not like to give preference to any community search utility. That is, unless the author is willing to merge their implementation upstream.

@HUMORCE
Member

HUMORCE commented Jan 6, 2022

GitHub advanced search is my first choice for searching manifests, then extras/everything 😂

@rashil2000
Member

Well, if we're talking about personal use - I wrote a tiny PowerShell script to search for manifests, as I don't really like leaving the terminal 😜

https://github.com/rashil2000/scripts/blob/main/Find-Scoop.ps1

@mertd

mertd commented Jan 12, 2022

Thank you for the shout-out @JanPokorny! Of course I would not have any issues with the shovel search being linked or integrated on scoop.sh.

I also understand your concerns @rashil2000. Some ideas to address these (may be one or a combination of these):

  • compare existing web search utilities transparently and embrace one
  • mark the link/integration explicitly as a community effort
  • link multiple community sourced search tools (this may include CLI tools)
  • integrate upstream as per your suggestion -- will need some discussion with the author for future development
    • assimilate fully and add author to scoop.sh team?
    • add submodule and glue code?

@rasa
Member

rasa commented Jan 13, 2022

And we even could split this up into

  1. A back-end crawler that generates a db,
  2. A CLI search tool that queries the db,
  3. A web search tool that queries the db

I'm not attached to scoop-directory being chosen, but it or ScoopSearch seems to be a good candidate for 1.

Edit: shovel would also be a fine choice for 1. That said, none of these is built on top of PowerShell, which powers Scoop. Does that matter? In my opinion, no.
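
The three-stage split above could be sketched as follows. This is a hypothetical, minimal version of stages 1 and 2, assuming the crawler has a local checkout of a bucket's manifest JSON files; the schema and field names are illustrative, not any existing tool's:

```python
import json
import sqlite3
from pathlib import Path

def build_index(bucket_dir: str, db_path: str) -> int:
    """Stage 1: index every manifest JSON in bucket_dir into a SQLite DB."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS manifests "
        "(name TEXT PRIMARY KEY, version TEXT, description TEXT, homepage TEXT)"
    )
    count = 0
    for path in Path(bucket_dir).glob("*.json"):
        manifest = json.loads(path.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT OR REPLACE INTO manifests VALUES (?, ?, ?, ?)",
            (path.stem,
             manifest.get("version", ""),
             manifest.get("description", ""),
             manifest.get("homepage", "")),
        )
        count += 1
    conn.commit()
    conn.close()
    return count

def search(db_path: str, term: str) -> list:
    """Stage 2/3: CLI- or web-style query by name/description substring."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name FROM manifests WHERE name LIKE ? OR description LIKE ?",
        (f"%{term}%", f"%{term}%"),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

The point of the split is that both a CLI and a web front-end can query the same generated DB without re-crawling anything.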

@rashil2000
Member

rashil2000 commented Jan 13, 2022

Since shovel.sh searches the known buckets, it fits nicely into the "official" tool criteria.

>   • integrate upstream as per your suggestion -- will need some discussion with the author for future development
>     • assimilate fully and add author to scoop.sh team?

I think this is how we can go about it.

  • Transfer the repos shovel and shovel-data into ScoopInstaller org. Note that the author @mertd will continue to have full admin rights over these repos.
  • Rename them to scoop.sh and scoop.sh-data respectively.
  • In the shovel.sh website, add a homepage (currently it goes directly to the search form). This will serve as the new homepage for Scoop.
  • Finally, modify the DNS settings for scoop.sh domain to point to ScoopInstaller/scoop.sh@master instead of current ScoopInstaller/Scoop@gh-pages. This step needs to be done by @lukesampson since they own the domain.

@rashil2000
Member

rashil2000 commented Jan 13, 2022

> or ScoopSearch seem to be good candidates for 1.

We should rope @gpailler into the conversation too. ScoopSearch is as good a candidate as shovel.sh, and it sits in a convenient GitHub org of its own.

The procedure for merging upstream would be roughly the same as above.

@mertd

mertd commented Jan 13, 2022

I understand that the only change you would require to be made to code or other content would be to integrate the current contents of the scoop.sh front page. The author of the integrated web search (be that @gpailler or me) will retain copyright and ownership (under the respective license) of the transferred repositories, but should discuss major changes with other members of the @ScoopInstaller organization. Is this how you envisioned it @rashil2000?

If so, then I think I like this approach. The original goal for shovel.sh was to offer search functionality for scoop.sh the way formulae.brew.sh does for brew.sh. With this and some work to make the manifest pages indexable for external search engines (mertd/shovel#10), that goal would be fulfilled.

@rashil2000
Member

Yes, precisely

@rasa
Member

rasa commented Jan 14, 2022

Can we discuss the benefits and drawbacks of each option before we commit to a particular tool? For example, I think there are many benefits to the back-end crawler creating an indexed database that can be quickly searched. This seems preferable to a large JSON object. I guess it could create both, if we wanted, but the db would be preferable, imo.

@JanPokorny
Author

If you create an SQLite database, you can even quickly query it from the frontend https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/

@rashil2000
Member

> If you create an SQLite database, you can even quickly query it from the frontend https://phiresky.github.io/blog/2021/hosting-sqlite-databases-on-github-pages/

Yes, this is what scoop-directory uses :)

Although the DB file isn't that big - just around 7-8MB

@gpailler

Thanks for your interest in ScoopSearch guys 😄

It would be great to have an official web search engine for Scoop packages, and I'm willing to help with that and to transfer the repos if it makes sense to you.

The ScoopSearch backend is hosted on Azure (Azure Search and Azure Functions) and costs USD 0/month (Free Tier). Even with more traffic, I think we will stay within the Free Tier. I can transfer/configure the backend to an "official" scoop.sh Azure subscription too.

I also checked quickly, and we can dump the full search index to JSON if we want to provide online and offline search capabilities.

Merging all our projects and providing a unified solution to search the Scoop packages is definitely the best solution for the community.

So up to you now !

@rashil2000
Member

> I also checked quickly, and we can dump the full search index to JSON if we want to provide online and offline search capabilities.

Just a quick question, how large would that JSON be? shovel.sh only indexes known buckets, and the single-line JSON there is already 4.81MB.

@mertd

mertd commented Jan 15, 2022

> Just a quick question, how large would that JSON be? shovel.sh only indexes known buckets, and the single-line JSON there is already 4.81MB.

The shovel.sh JSON was so large because I didn't filter out keys from the manifest files that are not needed for searching (mertd/shovel-data#9). After filtering, the file is down to 1.16MB.

I believe there are benefits to each of the approaches chosen by @gpailler, @rasa and I respectively. For shovel.sh, I went with an approach targeting simplicity and least cost: Generate the index as part of a scheduled GitHub pipeline and host it and the web app statically. Of course this has drawbacks too; executing the actual search on a back end will make search performance less dependent on the client.
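
The "scheduled GitHub pipeline" approach described above might look roughly like this as a workflow file. This is a sketch only; the script path, schedule, and output filename are all hypothetical:

```yaml
# Hypothetical scheduled indexer workflow; names and paths are illustrative.
name: rebuild-search-index
on:
  schedule:
    - cron: "0 */4 * * *"   # rebuild every 4 hours
  workflow_dispatch:         # allow manual runs too
jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Generate filtered index
        run: python scripts/build_index.py --out index.json
      - name: Commit updated index
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add index.json
          git commit -m "Update search index" || echo "No changes"
          git push
```

The appeal of this design is exactly what mertd names: no server to run, so hosting cost stays at zero, at the price of doing the search work on the client.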

@rashil2000
Member

I see.

@gpailler

> Just a quick question, how large would that JSON be? shovel.sh only indexes known buckets, and the single-line JSON there is already 4.81MB.

I checked this morning and ended up with a JSON file of 9.95MB for ~15,300 documents in the index. The reason for this "small" size is that the manifests are parsed and only the relevant information is added to the index.

For example, only the following content is stored in the index for the 7-zip manifest

{
   "Id":"adde431fdac84b7bbf54205c3ef58594fef42a5d",
   "Name":"7zip",
   "NameSortable":"7zip",
   "NamePartial":"7zip",
   "NameSuffix":"7zip",
   "Description":"A multi-format file archiver with high compression ratios",
   "Homepage":"https://www.7-zip.org/",
   "License":"Freeware,LGPL-2.0-only,BSD-3-Clause",
   "Version":"21.07",
   "Metadata":{
      "Repository":"https://github.com/ScoopInstaller/Main",
      "OfficialRepository":true,
      "OfficialRepositoryNumber":1,
      "RepositoryStars":850,
      "BranchName":"master",
      "FilePath":"bucket/7zip.json",
      "AuthorName":"github-actions[bot]",
      "AuthorMail":"41898282\u002Bgithub-actions[bot]@users.noreply.github.com",
      "Committed":"2021-12-27T12:30:02Z",
      "Sha":"bcaca41c8cb6ca07841d4bacd722986c1e894609"
   }
}

As @mertd said, parsing the manifests adds some complexity but it was required with my approach as I had to populate the Azure Search index properly and keep the index size under control (the Free Tier limit is 50MB).
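
The key-filtering both authors describe can be sketched as follows. The function and its field list are hypothetical, modeled loosely on the 7-Zip index document above; a real indexer would handle more manifest shapes:

```python
# Only these manifest keys survive into the search index; everything else
# (install scripts, hashes, autoupdate blocks) is dropped to keep it small.
INDEXED_KEYS = ("description", "homepage", "license", "version")

def to_index_document(name: str, manifest: dict) -> dict:
    """Reduce a full Scoop manifest to the fields worth searching."""
    doc = {"Name": name}
    for key in INDEXED_KEYS:
        value = manifest.get(key, "")
        # Manifests may express the license as a nested object.
        if key == "license" and isinstance(value, dict):
            value = value.get("identifier", "")
        doc[key.capitalize()] = value
    return doc
```

Dropping the non-searchable keys is what brings a multi-megabyte raw dump down to the sizes quoted in this thread.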

@rashil2000
Member

> a JSON file of 9.95MB for ~15,300 documents in the index

That seems pretty reasonable to me, thanks for the info!

@JanPokorny
Author

Integrating CLI search into this seems like a difficult problem when you think about it. The CLI might have non-public buckets added (thus needing local indexing). Also, if the database is statically hosted and queried on the client, re-downloading the database from the backend crawler every time it is outdated (which will happen often) might be slower than the current scoop search implementation. A backend like Azure Search could potentially be used to speed up queries from the CLI (and, most importantly, to query buckets that aren't locally added), but a hybrid approach incorporating any local non-public, non-indexed buckets would be needed either way.
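
The hybrid merge step described here might look like the following. Both result sources are stand-ins; the function only illustrates how local-bucket hits and remote-index hits could be combined:

```python
def merge_results(local: list, remote: list) -> list:
    """Combine local-bucket hits with remote-index hits.

    Local results come first: they are installable right away and may
    include manifests from private buckets the remote index never sees.
    Duplicates from the remote index are dropped case-insensitively.
    """
    seen = set()
    merged = []
    for name in list(local) + list(remote):
        key = name.lower()
        if key not in seen:
            seen.add(key)
            merged.append(name)
    return merged
```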

@rashil2000
Member

> The CLI might have non-public buckets added (thus need local indexing).
>
> Potentially a backend like the Azure Search could be used to speed up queries from the CLI (and most importantly, query not-added buckets), but a hybrid approach incorporating any local non-public non-indexed buckets would be needed either way.

I feel we shouldn't need to worry about local indexing as of now, as scoop search already handles it.

The CLI @rasa is talking about would probably be a separate search tool with a non-PowerShell implementation.

> Also if the database is statically hosted and queried on the client, re-downloading the database from the backend crawler every time it is outdated (which will happen often) might be slower than the current scoop search.

We could set a time interval for this, like 4 hours (or maybe a day). Quite a few tools (like tealdeer) do this. It doesn't really add much delay.

For instance, the little search tool I mentioned in #4627 (comment) downloads the DB file if it's older than a day, and the 7-8MB file does not take more than a couple of seconds. A one-time update like this can be tolerated IMO.
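
The tealdeer-style staleness check described here could be sketched like this; the constant and function names are hypothetical:

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # re-download after one day, as suggested

def needs_refresh(db_file: str, now=None) -> bool:
    """True if the cached index is missing or older than MAX_AGE_SECONDS."""
    path = Path(db_file)
    if not path.exists():
        return True
    current = now if now is not None else time.time()
    return (current - path.stat().st_mtime) > MAX_AGE_SECONDS
```

A CLI would call this once at startup and only hit the network when it returns `True`, which is why the update cost amortizes to near zero for repeated searches.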

@rasa
Member

rasa commented Jan 17, 2022

That seems to be how winget works: it downloads the database whenever it senses it's out of date.

We should compress the .json blob and/or SQLite .db to speed up downloading.
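
The compression idea can be sketched with the standard library; the function names are hypothetical. The win on a JSON index tends to be large because manifest documents repeat the same keys thousands of times:

```python
import gzip
import json

def compress_index(documents: list) -> bytes:
    """Serialize the index once and gzip it for faster downloads."""
    raw = json.dumps(documents, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw, compresslevel=9)

def load_index(blob: bytes) -> list:
    """Client side: decompress and parse the downloaded index."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

In practice a static host can also serve the uncompressed file with `Content-Encoding: gzip` negotiated by the web server, which gets the same effect without client-side code.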

@rashil2000
Member

@rasa Should we start a poll to vote? I'll tag the active maintainers and some recent active contributors.

@rasa
Member

rasa commented Jan 22, 2022

A poll is a good idea, but I'm not sure how to structure it. My thought is that there are really three (or four) parts to our search functionality:

  1. Back-end crawler
  2. Web front-end
  3. CLI front-end
  4. GUI front-end

And each could be independent. Perhaps a poll for each? And maybe have an option to develop something from scratch, such as implementing a back-end crawler in Rust, Go, or PowerShell? Or am I overthinking?

@rashil2000
Member

This issue concerns only the "web" part, i.e. Back-end crawler and Web front-end, both of which already exist (in some form) as scoop-directory, shovel.sh and ScoopSearch. So I was thinking of a poll between these three.

I don't think many people have tried making a GUI for Scoop ("A command-line installer", after all). Nevertheless, this is separate from the website component and is being tracked here - #4660

Similarly, for CLI utilities, we can open a separate tracking issue (given that there are already 2 good options - https://github.com/shilangyu/scoop-search and https://github.com/tokiedokie/scoop-search - which can be extended to search the website's JSON too).

@rashil2000
Member

rashil2000 commented Feb 1, 2022

I have created a poll. Please vote!

The outcome of the poll will undergo the rough procedure described in #4627 (comment) to get integrated into Scoop upstream.



I am tagging some recent/frequent contributors/maintainers (in no particular order). Your feedback is valuable!

@ScoopInstaller/maintainers

@tech189 @littleli @igitur @hu3rror @Erisa @Lutra-Fs @segevfiner @LazyGeniusMan @Slach @jcwillox @phanirithvij @AntonOks @RavenMacDaddy @sitiom @wenmin92 @TheRandomLabs @amreus

(I just went through the recent merged PRs and picked these names. If your name isn't there, that doesn't mean you can't vote!)

@rashil2000
Member

rashil2000 commented Feb 1, 2022

You can also comment below to explain why you chose an option. I'll start.

ScoopSearch, because:

  • Searches practically all manifests on GitHub (close to 20k). With an optional (and very useful!) toggle to filter official manifests.
  • Sort results by match, name, date
  • Search by bucket
  • Includes some helpful info/links in each result - license, last updated date, last committer, bucket, popularity etc.
  • Directly copy code to add bucket or install an app
  • List all community buckets

@rashil2000 rashil2000 pinned this issue Feb 1, 2022
@rasa
Member

rasa commented Feb 1, 2022

Based on preliminary results, it looks like we're going with ScoopSearch. That sounds great! Let me know how I can support ScoopSearch moving forward. I will keep scoop-directory up for the foreseeable future, but will direct users to use ScoopSearch as the "official" search engine. Thanks to everyone for voting, and your past support!

@JanPokorny

This comment was marked as resolved.

@rashil2000

This comment was marked as resolved.

@JanPokorny

This comment was marked as resolved.

@rashil2000
Member

The brand new website for Scoop is up 🎉🎊
