How to implement GPU memory recycling in CUDA C++ for data streaming in TensorFlow? - Stack Overflow
I have to decide on the specification of a project for my HPC course, which involves optimizing GPU memory usage in a data streaming context. Specifically, I aim to implement a mechanism for recycling allocated memory on the GPU to improve efficiency while processing a stream of input data.
I've been considering TensorFlow as the framework for this task because of its built-in support for GPU operators, and I was wondering whether its API includes features to simulate or handle streaming. However, I'm unsure how to approach the problem of memory recycling in this context.
Here are my specific questions:
- Memory Recycling in TensorFlow: Does TensorFlow have built-in tools or patterns for recycling GPU memory during continuous data processing, or would I need to implement a custom solution? The scope of my project is to implement CUDA C++ code, so I'm particularly interested in whether TensorFlow lacks a solution for GPU memory recycling when the input is a data stream (e.g., sparse matrices or other data structures whose dimensions significantly impact performance).
- Custom GPU Operators: If I need to create custom GPU operators to manage memory more efficiently, how should I approach this in TensorFlow? Are there resources or examples for implementing such custom operators?
- Profiling Memory Usage: What are the best practices for profiling and monitoring GPU memory usage in TensorFlow and CUDA, especially when working with data streams? The goal is to optimize GPU memory usage and minimize the impact of the PCIe transfer bottleneck. I am considering using nvprof and its graphical counterpart to profile CUDA execution.
- Streaming in TensorFlow: Does TensorFlow provide APIs for handling streaming inputs, or tools to emulate this behavior?
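To make "memory recycling" concrete, here is a minimal sketch of the kind of mechanism I have in mind: a pool that keeps released buffers in size buckets and hands them back out instead of allocating again. Everything in it is hypothetical (the `RecyclingPool` name, the power-of-two bucketing, the counters); it is not taken from TensorFlow or WindFlow. It defaults to host `malloc`/`free` as stand-ins so it runs without a GPU; on a device the two callbacks would presumably wrap `cudaMalloc` and `cudaFree`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical recycling pool: released buffers are cached per size bucket
// and reused by later requests that round up to the same capacity.
class RecyclingPool {
public:
    using AllocFn = std::function<void*(std::size_t)>;
    using FreeFn = std::function<void(void*)>;

    // Assumption: host malloc/free stand in for the device allocator.
    explicit RecyclingPool(
        AllocFn alloc = [](std::size_t n) { return std::malloc(n); },
        FreeFn free_fn = [](void* p) { std::free(p); })
        : alloc_(std::move(alloc)), free_(std::move(free_fn)) {}

    RecyclingPool(const RecyclingPool&) = delete;
    RecyclingPool& operator=(const RecyclingPool&) = delete;

    // Frees only the cached buffers; buffers never released back to the
    // pool remain the caller's responsibility.
    ~RecyclingPool() {
        for (auto& bucket : free_lists_)
            for (void* p : bucket.second) free_(p);
    }

    // Returns a buffer of at least `bytes`, reusing a cached one if possible.
    void* acquire(std::size_t bytes) {
        const std::size_t cap = round_up_pow2(bytes);
        std::vector<void*>& bucket = free_lists_[cap];
        void* p = nullptr;
        if (!bucket.empty()) {
            p = bucket.back();
            bucket.pop_back();
            ++reused_;
        } else {
            p = alloc_(cap);
            if (p != nullptr) ++fresh_;
        }
        if (p != nullptr) in_use_[p] = cap;
        return p;
    }

    // Puts the buffer back into its bucket instead of freeing it.
    void release(void* p) {
        auto it = in_use_.find(p);
        assert(it != in_use_.end() && "pointer not owned by this pool");
        free_lists_[it->second].push_back(p);
        in_use_.erase(it);
    }

    std::size_t fresh_allocations() const { return fresh_; }
    std::size_t reused_allocations() const { return reused_; }

private:
    // Smallest power of two >= n (overflow for huge n is ignored here).
    static std::size_t round_up_pow2(std::size_t n) {
        std::size_t cap = 1;
        while (cap < n) cap <<= 1;
        return cap;
    }

    AllocFn alloc_;
    FreeFn free_;
    std::unordered_map<std::size_t, std::vector<void*>> free_lists_;
    std::unordered_map<void*, std::size_t> in_use_;
    std::size_t fresh_ = 0;
    std::size_t reused_ = 0;
};
```

With this sketch, acquiring 1000 bytes, releasing the buffer, and then acquiring 700 bytes returns the same pointer, because both requests round up to the 1024-byte bucket. The sketch is not thread-safe and ignores CUDA stream ordering, which a real GPU version would have to handle.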
If TensorFlow isn't the best choice for this type of project, I would also appreciate suggestions for alternative frameworks or tools that might be better suited to GPU memory management in a streaming context. To provide some context, this feature has already been implemented in the WindFlow library. My professor and I were discussing the possibility of implementing it in another streaming tool such as Flink, but Flink doesn't support GPU "operators". As a result, the scope of the project along that path might become too large for an exam worth only 9 CFUs.
I apologize in advance if my question seems somewhat vague; I am currently navigating a phase filled with ambiguity and multiple potential directions. Any guidance, references, or sample code to get started with this would be greatly appreciated!