[Hackathon 6th Article No.6] Guide to Using Sparse Computation #880
rfcs/Article/guide_to_use_sparse.md
Outdated
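As shown, the sparse ResNet code is almost identical to the regular ResNet code. Only the import path changes, and the rest of the network code needs essentially no modification. With `from paddle.sparse import nn`, the familiar `nn.*` style still works, which makes it easier to get started.

## 3. 3D Point Cloud CenterPoint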
Suggest changing this heading to: "Paddle Sparse Computation in Practice".
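As shown, the sparse ResNet code is almost identical to the regular ResNet code. Only the import path changes, and the rest of the network code needs essentially no modification. With `from paddle.sparse import nn`, the familiar `nn.*` style still works, which makes it easier to get started.

## 3. 3D Point Cloud CenterPoint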
Add a short introduction here stating that 3D point cloud CenterPoint will be used as the example.
rfcs/Article/guide_to_use_sparse.md
Outdated
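The main sparse formats supported by PaddlePaddle are:

- COO format (Coordinate Format): a sparse matrix whose nonzero elements are represented by coordinates, using three arrays: row indices, column indices, and values.
- CSR format (Compressed Sparse Row): compresses the rows of the sparse matrix to save storage space.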
Please describe which arrays it contains.
```python
import numpy as np
import paddle.sparse as sparse
def random_sparse_tensor(shape, density, sparse_type='coo'):
```
Can this be simplified? Just use paddle.rand, apply a random mask or dropout, and then call to_sparse_coo/to_sparse_csr.
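The idea in this comment (dense random values, a random mask, then a conversion to a sparse layout) can be illustrated without Paddle. The following is a minimal pure-Python sketch, not the article's code: the helper names `random_masked_matrix` and `dense_to_coo` are hypothetical, and `density` is assumed to mean the probability that an entry is kept. In Paddle itself, the conversion step would be the `to_sparse_coo` / `to_sparse_csr` calls named in the comment.

```python
import random


def random_masked_matrix(rows, cols, density, seed=0):
    """Build a dense matrix, keeping each entry with probability `density`."""
    rng = random.Random(seed)
    dense = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            value = rng.random()           # uniform sample in [0, 1)
            keep = rng.random() < density  # random mask decision
            row.append(value if keep else 0.0)
        dense.append(row)
    return dense


def dense_to_coo(dense):
    """Collect nonzero entries as ([row coords, col coords], values)."""
    row_idx, col_idx, values = [], [], []
    for r, row in enumerate(dense):
        for c, value in enumerate(row):
            if value != 0.0:
                row_idx.append(r)
                col_idx.append(c)
                values.append(value)
    return [row_idx, col_idx], values


dense = random_masked_matrix(4, 5, density=0.3)
indices, values = dense_to_coo(dense)
print(indices, values)
```

Masking a dense tensor and converting afterwards keeps the helper short, at the cost of materializing the dense tensor first, which is acceptable for small tutorial-sized examples.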
@zhwesky2010 All requested changes have been made. Could you please take another look?
rfcs/Article/guide_to_use_sparse.md
Outdated
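- COO format (Coordinate Format): a sparse matrix whose nonzero elements are represented by coordinates, using three arrays: row indices, column indices, and values.
- CSR format (Compressed Sparse Row): compresses the rows of the sparse matrix to save storage space.
- COO format (Coordinate Format): this format uses coordinates to represent the nonzero elements of a sparse matrix and involves three arrays: row indices, column indices, and values.
- CSR format (Compressed Sparse Row): this format saves storage space by compressing the rows of the sparse matrix. It contains three arrays: Index Pointers, indices, and Data.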
Use: row pointer information, column coordinates, and values.
rfcs/Article/guide_to_use_sparse.md
Outdated
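@@ -15,15 +15,22 @@

The main sparse formats supported by PaddlePaddle are:

- COO format (Coordinate Format): a sparse matrix whose nonzero elements are represented by coordinates, using three arrays: row indices, column indices, and values.
- CSR format (Compressed Sparse Row): compresses the rows of the sparse matrix to save storage space.
- COO format (Coordinate Format): this format uses coordinates to represent the nonzero elements of a sparse matrix and involves three arrays: row indices, column indices, and values.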
Use: row coordinates, column coordinates, and values.
rfcs/Article/guide_to_use_sparse.md
Outdated
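@@ -84,6 +91,12 @@ The CSR format also stores three arrays: Index Pointers, indices, and Data

For example, if the first pair of index pointers is `[0,2]`, it describes the elements in row 0 of the sparse matrix (the index of 0 in the Index Pointers array is 0) and indicates that row 0 contains two nonzero elements. `Indices[0,2]` then gives the column indices of these two elements, and `Data[0,2]` gives their values.

The CSR implementation in `paddle.sparse` also uses three lists:

- the one-dimensional list `crows` corresponds to the Index Pointers array;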
Same as above.
rfcs/Article/guide_to_use_sparse.md
Outdated
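COO format:

![COO Matrix](images/coo.gif)

The COO format stores three arrays: row indices (Row), column indices (Column), and values (Data). Using the index of an element in the Data array to look up the Row and Column arrays gives that element's position in the original matrix.

The COO implementation in `paddle.sparse` uses two lists:

- the two-dimensional list `indices`, which contains the row indices and column indices;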
Same as above.
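To make the COO layout discussed in this thread concrete, here is a small pure-Python sketch. It assumes the two-list layout described in the hunk (a two-dimensional `indices` list plus a list of values); the example matrix and the helper name `coo_to_dense` are hypothetical and chosen only for illustration.

```python
# Dense matrix represented below:
# [[0, 0, 3],
#  [4, 0, 0],
#  [0, 5, 6]]
indices = [
    [0, 1, 2, 2],  # row coordinates of the nonzero elements
    [2, 0, 1, 2],  # column coordinates of the nonzero elements
]
values = [3, 4, 5, 6]
shape = (3, 3)


def coo_to_dense(indices, values, shape):
    """Rebuild the dense matrix by placing each value at its (row, col)."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for k, value in enumerate(values):
        # The k-th value lives at (indices[0][k], indices[1][k]).
        dense[indices[0][k]][indices[1][k]] = value
    return dense


dense = coo_to_dense(indices, values, shape)
print(dense)
```

The position `k` of a value is the only link between the three arrays, which is why the row coordinates, column coordinates, and values must all have the same length.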
rfcs/Article/guide_to_use_sparse.md
Outdated
dense_tensor = paddle.randn(shape)
dropout = paddle.nn.Dropout(p=density)
dense_tensor = dropout(dense_tensor)
paddle.nn.functional.dropout
rfcs/Article/guide_to_use_sparse.md
Outdated
if sparse_type == 'coo':
    return sparse.sparse_coo_tensor(indices, values.tolist(), shape)
sparse_tensor = dense_tensor.to_sparse_coo(sparse_dim=dense_tensor.dim())
The default value of sparse_dim should be fine here, shouldn't it?
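> The default value of sparse_dim should be fine here, shouldn't it?

@zhwesky2010
As the screenshot below shows, the code raised an error when I did not pass a value for the sparse_dim parameter:
My Paddle version is the stable release 2.6.1.
@zhwesky2010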
rfcs/Article/guide_to_use_sparse.md
Outdated
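@@ -15,29 +15,31 @@

The main sparse formats supported by PaddlePaddle are:

- COO format (Coordinate Format): this format uses coordinates to represent the nonzero elements of a sparse matrix and involves three arrays: row indices, column indices, and values.
- CSR format (Compressed Sparse Row): this format saves storage space by compressing the rows of the sparse matrix. It contains three arrays: Index Pointers, indices, and Data.
- COO format (Coordinate Format): this format uses coordinates to represent the nonzero elements of a sparse matrix and involves three arrays: row coordinates, column coordinates, and values.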
Say that it "contains" three arrays.
rfcs/Article/guide_to_use_sparse.md
Outdated
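* Two adjacent elements in the Index Pointers array determine two pieces of information.
* The coordinate r of the first element of the pointer pair is the row number in the sparse matrix.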
This sentence can be deleted.
rfcs/Article/guide_to_use_sparse.md
Outdated
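* Second, these values describe a [start: stop] slice of the Indices array, and their difference is the number of nonzero elements in each row. Use the pointers to look up the indices and determine the column of each element in the data.
* If a is not equal to b, the two values a and b form a slice [a:b]. Applying this slice to the column coordinate array gives the column coordinates of the nonzero elements in row r, and applying it to the value array gives their values.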
Simplify this: if a is not equal to b, then indices[a: b] holds the column coordinates of the nonzero elements in row r, and Data[a: b] holds their values.
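The slicing rule in this comment can be checked with a short pure-Python sketch. The name `crows` follows the article's hunk; the names `cols` and `values`, the helper `csr_row`, and the example matrix are assumptions made only for this illustration.

```python
# Dense matrix represented below (row 1 is intentionally empty):
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 0]]
crows = [0, 2, 2, 3]  # row pointers: row r spans the slice crows[r]:crows[r + 1]
cols = [0, 2, 1]      # column coordinates of the nonzero elements
values = [1, 2, 3]    # nonzero values, stored row by row


def csr_row(crows, cols, values, r):
    """Return (column coordinates, values) of the nonzero elements in row r."""
    a, b = crows[r], crows[r + 1]
    # When a == b the slices are empty, meaning row r has no nonzero elements.
    return cols[a:b], values[a:b]


for r in range(len(crows) - 1):
    print(r, csr_row(crows, cols, values, r))
```

Here `b - a` is the number of nonzero elements in row `r`, so the row pointer list always has one more entry than the matrix has rows.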
rfcs/Article/guide_to_use_sparse.md
Outdated
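The CSR format also stores three arrays: row pointer information (Index Pointers), column coordinates (indices), and values (Data).

* For two adjacent elements of the row pointer array, suppose their coordinates are r and r+1 and their values are a and b. Then the following can be determined: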
The row pointer array records ...., suppose two adjacent elements in it have coordinates ...
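* The Indices array records the column index of each element.
* The Data array records the values of the elements.
* The column coordinate array records the column coordinates of the nonzero elements.
* The value array records the values of the nonzero elements.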
This example is no longer needed, since the r and r+1 case above already serves as an example.
@zhwesky2010 The requested changes have been made. Could you please take another look?
LGTM
No description provided.