Add wordcloud #1552

Merged
merged 16 commits into from
Dec 13, 2024
12 changes: 12 additions & 0 deletions tools/wordcloud/.shed.yml
@@ -0,0 +1,12 @@
name: wordcloud
owner: bgruening
description: A little word cloud generator in Python.
long_description: |
The wordcloud library allows you to create word clouds from text data.
It is highly customizable and can generate word clouds in various shapes and colors.
The wordcloud library is available as an open-source project on GitHub.
remote_repository_url: https://github.com/bgruening/galaxytools/tree/master/tools/wordcloud
homepage_url: https://github.com/amueller/word_cloud
type:
categories:
- Visualization
30 changes: 30 additions & 0 deletions tools/wordcloud/macros.xml
@@ -0,0 +1,30 @@
<macros>
<token name="@TOOL_VERSION@">1.9.4</token>
Owner

If it's just one tool, you can move this into the tool if you like, rather than keeping it as a separate file.

Contributor Author

Thanks for your comment. The files were updated based on your suggestion.

Collaborator

Ah, you misunderstood: using the tokens instead of hard-coded versions is important for automated update suggestions, but you can define the macros right in the tool XML instead of importing them from another file.

Contributor Author

Thanks for your comment. I updated the file.
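
The suggestion in this thread can be sketched as follows. This is only an illustration of the pattern, not the final state of the pull request: the tokens move from macros.xml into a `<macros>` section inside the tool file, and the version token is reused for the requirement. The token values are copied from the diff; trimming the requirements to the single wordcloud package is an assumption based on the later review comment.

```xml
<tool id="wordcloud" name="wordcloud" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@" license="MIT">
    <description>A little word cloud generator in Python.</description>
    <macros>
        <!-- Tokens defined inline, so no separate macros.xml is needed -->
        <token name="@TOOL_VERSION@">1.9.4</token>
        <token name="@VERSION_SUFFIX@">0</token>
        <token name="@PROFILE@">23.0</token>
    </macros>
    <requirements>
        <!-- Reusing the token keeps the tool version and package version in sync -->
        <requirement type="package" version="@TOOL_VERSION@">wordcloud</requirement>
    </requirements>
    <!-- command, inputs, outputs, tests, help and citations follow unchanged -->
</tool>
```

Keeping the version in a token rather than hard-coding it is what allows automated update suggestions to bump it in one place.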

<token name="@VERSION_SUFFIX@">0</token>
<token name="@PROFILE@">23.0</token>
<xml name="requirements">
<requirement type="package" version="1.9.4">wordcloud</requirement>
<requirement type="package" version="2.7.0">codecov</requirement>
<requirement type="package" version="7.3.2">coverage</requirement>
<requirement type="package" version="3.8.0">flake8</requirement>
<requirement type="package" version="7.4.3">pytest</requirement>
<requirement type="package" version="4.1.0">pytest-cov</requirement>
<requirement type="package" version="0.9.7">pytest-sugar</requirement>
<requirement type="package" version="28.0.0">setuptools</requirement>
<requirement type="package" version="4.0.2">twine</requirement>
<requirement type="package" version="0.38.1">wheel</requirement>
<requirement type="package" version="1.5.1">numpydoc</requirement>
<requirement type="package" version="2.31.1">imageio</requirement>
<requirement type="package" version="7.2.3">sphinx</requirement>
<requirement type="package" version="1.2.2">sphinx_rtd_theme</requirement>
<requirement type="package" version="0.8.2">sphinx_gallery</requirement>
<requirement type="package" version="0.4.0">sphinx-argparse</requirement>
<requirement type="package" version="3.0.1">sphinx-issues</requirement>
<requirement type="package" version="3.8.0">matplotlib</requirement>
<requirement type="package" version="6.0.4">multidict</requirement>
<requirement type="package" version="0.42.1">jieba</requirement>
<requirement type="package" version="1.11.3">scipy</requirement>
<requirement type="package" version="2.0.2">python-bidi</requirement>
<requirement type="package" version="3.0.0">arabic_reshaper</requirement>
Owner

Why are all of these needed? I would assume that it works with wordcloud as the only dependency.

Contributor Author

Yes, it works with only wordcloud as a requirement.
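
As a rough sketch of what the trimmed macro might look like once the development and documentation packages are dropped (assuming the macro name `requirements` stays the same and the `@TOOL_VERSION@` token is used for the package version):

```xml
<xml name="requirements">
    <!-- Only the runtime dependency remains; test and docs tooling is not needed to run the tool -->
    <requirement type="package" version="@TOOL_VERSION@">wordcloud</requirement>
</xml>
```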

</xml>
</macros>
Binary file added tools/wordcloud/test-data/DroidSansMono.ttf
Binary file not shown.
Binary file added tools/wordcloud/test-data/colormask.png
Binary file added tools/wordcloud/test-data/font.ttf
Binary file not shown.
Binary file added tools/wordcloud/test-data/mask.png
1 change: 1 addition & 0 deletions tools/wordcloud/test-data/regxp.txt
@@ -0,0 +1 @@
\w[\w']+
7 changes: 7 additions & 0 deletions tools/wordcloud/test-data/stopwords.txt
@@ -0,0 +1,7 @@
is
an
for
the
open
source
platform
1 change: 1 addition & 0 deletions tools/wordcloud/test-data/test.txt
@@ -0,0 +1 @@
Galaxy is an open source, web-based platform for data intensive biomedical research. Galaxy contains more than 800 different single analysis tools and ready-to-use pipelines for different applications.
210 changes: 210 additions & 0 deletions tools/wordcloud/wordcloud.xml
@@ -0,0 +1,210 @@
<tool id="wordcloud" name="wordcloud" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@" license="MIT">
<description>A little word cloud generator in Python.</description>
reytakop marked this conversation as resolved.
<macros>
<import>macros.xml</import>
</macros>
<requirements>
<expand macro="requirements"/>
</requirements>
<command detect_errors="exit_code"><![CDATA[
mkdir -p 'output' &&
wordcloud_cli --text $text
#if str($regexp) != '':
--regexp $regexp
#end if
#if str($stopwords) != '':
--stopwords $stopwords
#end if
#if str($fontfile) != '':
--fontfile $fontfile
#end if
#if str($mask) != '':
--mask $mask
#end if
#if str($colormask) != '':
--colormask $colormask
#end if
--contour_width $contour_width
--contour_color $contour_color
--relative_scaling $relative_scaling
--margin $margin
--width $width
--height $height
--color $color
--background $background
#if $no_collocations:
--no_collocations
#end if
#if $include_numbers:
--include_numbers
#end if
#if str($min_word_length) != '':
--min_word_length $min_word_length
#end if
#if str($prefer_horizontal) != '':
--prefer_horizontal $prefer_horizontal
#end if
#if str($scale) != '':
--scale $scale
#end if
#if str($colormap) != '':
--colormap $colormap
#end if
#if str($mode) != '':
--mode $mode
#end if
#if str($max_words) != '':
--max_words $max_words
#end if
#if str($min_font_size) != '':
--min_font_size $min_font_size
#end if
#if str($max_font_size) != '':
--max_font_size $max_font_size
#end if
#if str($font_step) != '':
--font_step $font_step
#end if
#if str($random_state) != '':
--random_state $random_state
#end if
#if $no_normalize_plurals:
--no_normalize_plurals
#end if
#if $repeat:
--repeat
#end if
#if $version:
--version
#end if
]]></command>
<inputs>
<conditional name="input_source">
<param name="select_file" type="select" label="Input from txt file">
<option value="true" selected="true">Use txt</option>
<option value="false">Provide sequence as text-field</option>
</param>
<when value="true">
<param format="txt" name="txt_input" type="data" label=" Enter your text file (Txt file)"/>
</when>
<when value="false">
<param name="input_sequence" type="text" label="Type your text here" help="Enter your text here"/>
</when>
</conditional>
<param argument="--regexp" type="data" format="txt" optional="True" label="Regular expression to filter words" help="regular expression to filter words"/>
<param argument="--stopwords" type="data" format="txt" optional="True" label="Stopwords file" help="Specify file of stopwords (containing one word per line) to remove from the given text after parsing"/>
<param argument="--fontfile" type="data" format="ttf" label="Font file you wish to use " help=" The font file you want to use"/>
<param argument="--min_font_size" value="4" type="integer" label=" Smallest font size to use" optional="False"/>
<param argument="--max_font_size" type="integer" label="Maximum font size for the largest word" optional="False"/>
<param argument="--font_step" value="1" type="integer" label="Step size for the font" help="Font_step &gt; 1 might speed up computation but give a worse fit"/>
<param argument="--margin" value="2" type="integer" label="Spacing to leave around words"/>
<param argument="--color" type="color" label="Use given color as coloring for the image">
<validator type="regex" message="Please select a valid RGB color">[#][0-9A-Fa-f]{6}</validator>
</param>
<param argument="--background" type="color" label="Use given color as background color for the image" value="#000000">
<validator type="regex" message="Please select a valid RGB color">[#][0-9A-Fa-f]{6}</validator>
</param>
<param argument="--mask" type="data" format="png" optional="True" label="Mask to use for the image form"/>
<param argument="--colormask" type="data" format="png" optional="True" label="Color mask to use for image coloring"/>
Owner

There are many "color" options here. Should they be in your color conditional?
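
The conditional the reviewer refers to is not visible in this commit's diff, so the following is only a hypothetical sketch of how the color-related options could be grouped. The names `color_option` and `color_selector`, the option values, and the choice of three branches are assumptions made for illustration; the parameter definitions inside each branch are adapted from the diff.

```xml
<conditional name="color_option">
    <!-- Hypothetical selector: the user picks one coloring method -->
    <param name="color_selector" type="select" label="Coloring method">
        <option value="colormap" selected="true">Matplotlib colormap</option>
        <option value="color">Single color</option>
        <option value="colormask">Color mask image</option>
    </param>
    <when value="colormap">
        <param argument="--colormap" type="select" label="Matplotlib colormap name">
            <option value="viridis" selected="true">viridis</option>
            <option value="plasma">plasma</option>
        </param>
    </when>
    <when value="color">
        <param argument="--color" type="color" value="#000000" label="Use given color as coloring for the image"/>
    </when>
    <when value="colormask">
        <param argument="--colormask" type="data" format="png" label="Color mask to use for image coloring"/>
    </when>
</conditional>
```

With a layout like this, the command block would read the values through the conditional (for example `$color_option.color`) and emit only the flag that belongs to the selected branch.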

<param argument="--contour_width" value="0" min="0" type="float" label="Contour width" help="Use given color as mask contour color"/>
<param argument="--contour_color" type="color" label="Contour color" value="#000000">
<validator type="regex" message="Please select a valid RGB color">[#][0-9A-Fa-f]{6}</validator>
</param>
<param argument="--relative_scaling" value="0.0" min="0.0" max="1.0" type="float" label="Scaling of words by frequency (0 - 1)"/>
<param argument="--no_collocations" type="boolean" truevalue="True" falsevalue="False" value="True" label="Do not add collocations (bigrams) to word cloud"/>
<param argument="--include_numbers" type="boolean" truevalue="True" falsevalue="False" value="False" label="Whether to include numbers as phrases or not"/>
<param argument="--min_word_length" value="0" type="integer" label="Minimum number of letters a word must have to be included"/>
<param argument="--prefer_horizontal" value="0.9" type="float" label="Ratio of times to try horizontal fitting as opposed to vertical" min="0" max="1"/>
<param argument="--scale" value="1.0" type="float" label="Scaling between computation and drawing"/>
<param argument="--colormap" type="select" label="Matplotlib colormap name" value="viridis">
<option value="viridis">viridis</option>
<option value="plasma">plasma</option>
<option value="inferno">inferno</option>
<option value="magma">magma</option>
<option value="cividis">cividis</option>
<option value="Greys">Greys</option>
<option value="Purples">Purples</option>
<option value="Blues">Blues</option>
<option value="Greens">Greens</option>
<option value="Oranges">Oranges</option>
<option value="Reds">Reds</option>
<option value="YlOrBr">YlOrBr</option>
<option value="YlOrRd">YlOrRd</option>
<option value="OrRd">OrRd</option>
<option value="PuRd">PuRd</option>
<option value="RdPu">RdPu</option>
<option value="BuPu">BuPu</option>
<option value="GnBu">GnBu</option>
<option value="PuBu">PuBu</option>
<option value="YlGnBu">YlGnBu</option>
<option value="PuBuGn">PuBuGn</option>
<option value="BuGn">BuGn</option>
<option value="YlGn">YlGn</option>
</param>
<param argument="--mode" type="select" label="Use RGB or RGBA for transparent background" value="RGB">
<option value="RGB">RGB</option>
<option value="RGBA">RGBA</option>
</param>
<param argument="--max_words" value="200" type="integer" label="Maximum number of words"/>
<param argument="--width" value="400" type="integer" label="Define output image width"/>
<param argument="--height" type="integer" value="200" label="Define output image height"/>
<param argument="--random_state" type="integer" label="Random seed"/>
<param argument="--no_normalize_plurals" value="True" type="boolean" optional="True" label="Whether to remove trailing s from words"/>
<param argument="--repeat" type="boolean" truevalue="True" falsevalue="False" value="False" label="Whether to repeat words and phrases until max_words or min_font_size is reached"/>
<param argument="--version" type="boolean" truevalue="True" falsevalue="False" value="False" label="Show program's version number"/>
Owner

The version option should not be offered to the user. Instead, add a <version_command> tag to the tool.

Contributor Author

Thanks, I deleted this line.
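
A minimal sketch of the suggested tag, assuming the `--version` flag of `wordcloud_cli` that the command block in this diff already passes through. Galaxy runs this command and records its output with the job, so the version does not need to be a user-facing parameter:

```xml
<!-- Placed as a direct child of <tool>, typically just before <command> -->
<version_command>wordcloud_cli --version</version_command>
```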

</inputs>
<outputs>
<data name="output_image" format="png" from_work_dir="output/wordcloud.png" label="${tool.name} on ${on_string}.png">
<filter>'png' in output_format</filter>
</data>
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="select_file" value="true"/>
<param name="txt_input" value="test.txt"/>
<param name="regexp" value="regxp.txt"/>
<param name="stopwords" value="stopwords.txt"/>
<param name="fontfile" value="font.ttf"/>
<param name="mask" value="mask.png"/>
<param name="colormask" value="colormask.png"/>
<param name="contour_width" value="1"/>
<param name="contour_color" value="#000000"/>
<param name="relative_scaling" value="1"/>
<param name="margin" value="1"/>
<param name="width" value="400"/>
<param name="height" value="400"/>
<param name="color" value="#898989"/>
<param name="background" value="#116787"/>
<param name="no_collocations" value="False"/>
<param name="include_numbers" value="True"/>
<param name="min_word_length" value="1"/>
<param name="prefer_horizontal" value="0.5"/>
<param name="scale" value="1"/>
<param name="colormap" value="viridis"/>
<param name="mode" value="RGB"/>
<param name="max_words" value="100"/>
<param name="min_font_size" value="1"/>
<param name="max_font_size" value="100"/>
<param name="font_step" value="1"/>
<param name="random_state" value="10"/>
<param name="no_normalize_plurals" value="False"/>
<param name="repeat" value="False"/>
<param name="version" value="True"/>
<output name="output_image" file="output/wordcloud.png"/>
</test>
</tests>
<help><![CDATA[
A little word cloud generator in Python.
A word cloud is a visual representation (image) of word data. In other words, it is a collection, or cluster, of words depicted in different sizes. The bigger and bolder the word appears, the more often it's mentioned within a given text and the more important it is.
]]></help>
reytakop marked this conversation as resolved.
<citations>
<citation type="bibtex">
@misc{amueller2018wordcloud,
title={Word Clouds with Python},
author={Amueller, Sebastian},
year={2018},
url={https://amueller.github.io/word_cloud/}
}
</citation>
</citations>
</tool>