Commit e6122ef

Added documentation for new functions and corrections to accomplish PEP8

Roberto Zamora-Zamora committed Oct 31, 2015
1 parent 6b0cfc4 commit e6122ef
Showing 3 changed files with 291 additions and 259 deletions.
42 changes: 36 additions & 6 deletions doc/source/driver.rst
@@ -337,7 +337,7 @@ Constants

.. attribute:: CONTEXT
MEMORY_TYPE
DEVICE_POINTER
HOST_POINTER

CUDA 4.0 and above.
@@ -928,7 +928,7 @@ Global Device Memory

A base class that facilitates casting to pointers within PyCUDA.
This allows the user to construct custom pointer types that may
have been allocated by facilities outside of PyCUDA proper, but
still need to be objects to facilitate RAII. The user needs to
supply one method to facilitate the pointer cast:

@@ -1076,7 +1076,7 @@ Post-Allocation Pagelocking
.. method:: unregister()

Unregister the page-lock on the host memory held by this instance.
Note that this does not free the memory; it only releases the
page-lock.

.. attribute:: base
@@ -1103,7 +1103,7 @@ CUDA 6.0 adds support for a "Unified Memory" model, which creates a managed
virtual memory space that is visible to both CPUs and GPUs. The OS will
migrate the physical pages associated with managed memory between the CPU and
GPU as needed. This allows a numpy array on the host to be passed to kernels
without first creating a DeviceAllocation and manually copying the host data
to and from the device.

.. note::
@@ -1222,7 +1222,7 @@ an explicit copy::
The CUDA Unified Memory model has very specific rules regarding concurrent
access of managed memory allocations. Host access to any managed array
is not allowed while the GPU is executing a kernel, regardless of whether
the array is in use by the running kernel. Failure to follow the
concurrency rules will generate a segmentation fault, *causing the Python
interpreter to terminate immediately*.

@@ -1390,6 +1390,36 @@ Arrays and Textures
it is `"C"`, then `tex2D(x,y)` is going to fetch `matrix[y,x]`,
and vice versa for `"F"`.
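The `"C"`-order indexing convention above can be checked with a small host-side NumPy sketch (no GPU required; the matrix `m` is a made-up example):

```python
import numpy as np

# A row-major ("C" order) matrix of shape (height, width)
m = np.arange(6, dtype=np.float32).reshape(2, 3)

# For a "C"-ordered matrix bound to a texture, tex2D(x, y) fetches
# matrix[y, x]: x indexes the fastest-varying (last) axis, y the slower one.
x, y = 2, 1
flat_index = y * m.shape[1] + x          # row-major linearization
assert m[y, x] == m.ravel(order="C")[flat_index]
```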

.. function:: np_to_array(nparray, order, allowSurfaceBind=False)

Turn a :class:`numpy.ndarray` with a 2D or 3D structure into an
:class:`Array`.
The `order` argument can be either `"C"` or `"F"`.
If `allowSurfaceBind` is passed as *True*, the returned :class:`Array`
can be read and written with a :class:`SurfaceReference` in addition
to a :class:`TextureReference`.
The function automatically detects the `dtype` and adjusts the channel
count to a supported :class:`array_format`. It also directly supports
the `np.float64`, `np.complex64` and `np.complex128` formats, similarly
to :meth:`bind_to_texref_ext`.

.. versionadded:: 2015.1

.. function:: gpuarray_to_array(gpuarray, order, allowSurfaceBind=False)

Turn a :class:`GPUArray` with a 2D or 3D structure into an
:class:`Array`.
The `order` argument can be either `"C"` or `"F"`.
If `allowSurfaceBind` is passed as *True*, the returned :class:`Array`
can be read and written with a :class:`SurfaceReference` in addition
to a :class:`TextureReference`.
The function automatically detects the `dtype` and adjusts the channel
count to a supported :class:`array_format`. It also directly supports
the `np.float64`, `np.complex64` and `np.complex128` formats, similarly
to :meth:`bind_to_texref_ext`.

.. versionadded:: 2015.1
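The claim above that complex data is handled through extra integer channels can be illustrated on the host with a NumPy `view`; this sketch only demonstrates the bit-level layout (why `num_channels` becomes 2 for `np.complex64`), not an actual texture fetch:

```python
import numpy as np

# A complex64 element is 8 bytes: 4 bytes real + 4 bytes imaginary.
a = np.array([[1 + 2j, 3 + 4j]], dtype=np.complex64)

# Viewing it as int32 doubles the width: each complex value becomes
# two 32-bit channels, matching the int2 (hi=re, lo=im) reading.
channels = a.view(np.int32)
assert channels.shape == (1, 4)           # width doubled: 2 complex -> 4 int32
assert channels.tobytes() == a.tobytes()  # same underlying bits
```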

.. function:: make_multichannel_2d_array(matrix, order)

Turn the three-dimensional :class:`numpy.ndarray` object *matrix* into
@@ -1603,7 +1633,7 @@ Structured Memory Transfers

.. class:: Memcpy3DPeer()

:class:`Memcpy3DPeer` has the same members as :class:`Memcpy3D`,
and additionally all of the following:

.. method:: set_src_context(ctx)
160 changes: 79 additions & 81 deletions pycuda/driver.py
@@ -725,26 +725,25 @@ def matrix_to_array(matrix, order, allow_double_hack=False):
return ary

def np_to_array(nparray, order, allowSurfaceBind=False):

    case = order in ["C", "F"]
    if not case:
        raise LogicError("order must be either F or C")

    dimension = len(nparray.shape)
    if dimension == 2:
        if order == "C": stride = 0
        if order == "F": stride = -1
        h, w = nparray.shape
        d = 1
        if allowSurfaceBind:
            descrArr = ArrayDescriptor3D()
            descrArr.width = w
            descrArr.height = h
            descrArr.depth = d
        else:
            descrArr = ArrayDescriptor()
            descrArr.width = w
            descrArr.height = h
    elif dimension == 3:
        if order == "C": stride = 1
        if order == "F": stride = 1
@@ -754,65 +753,64 @@ def np_to_array(nparray, order, allowSurfaceBind=False):
        descrArr.height = h
        descrArr.depth = d
    else:
        raise LogicError("only CUDA arrays of dimension 2 or 3 are supported at the moment")
    if nparray.dtype == np.complex64:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int2 (hi=re, lo=im) structure
        descrArr.num_channels = 2
    elif nparray.dtype == np.float64:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int2 (hi, lo) structure
        descrArr.num_channels = 2
    elif nparray.dtype == np.complex128:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int4 (re=(hi,lo), im=(hi,lo)) structure
        descrArr.num_channels = 4
    else:
        descrArr.format = dtype_to_array_format(nparray.dtype)
        descrArr.num_channels = 1

    if allowSurfaceBind:
        if dimension == 2: descrArr.flags |= array3d_flags.ARRAY3D_LAYERED
        descrArr.flags |= array3d_flags.SURFACE_LDST

    cudaArray = Array(descrArr)
    if allowSurfaceBind or dimension == 3:
        copy3D = Memcpy3D()
        copy3D.set_src_host(nparray)
        copy3D.set_dst_array(cudaArray)
        copy3D.width_in_bytes = copy3D.src_pitch = nparray.strides[stride]
        copy3D.src_height = copy3D.height = h
        copy3D.depth = d
        copy3D()
        return cudaArray
    else:
        copy2D = Memcpy2D()
        copy2D.set_src_host(nparray)
        copy2D.set_dst_array(cudaArray)
        copy2D.width_in_bytes = copy2D.src_pitch = nparray.strides[stride]
        copy2D.src_height = copy2D.height = h
        copy2D(aligned=True)
        return cudaArray
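The `stride` index chosen above (`0` for `"C"`, `-1` for `"F"` in the 2D case) picks out the pitch, in bytes, of the contiguous axis, which becomes `width_in_bytes`. A host-only NumPy sketch of that bookkeeping (the shape is a made-up example):

```python
import numpy as np

h, w = 4, 8  # example 2D shape

# "C" order: rows are contiguous, so strides[0] is the row pitch in bytes.
a_c = np.zeros((h, w), dtype=np.float32, order="C")
assert a_c.strides[0] == w * a_c.itemsize   # 8 floats * 4 bytes = 32

# "F" order: columns are contiguous, so strides[-1] is the column pitch.
a_f = np.zeros((h, w), dtype=np.float32, order="F")
assert a_f.strides[-1] == h * a_f.itemsize  # 4 floats * 4 bytes = 16
```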

def gpuarray_to_array(gpuarray, order, allowSurfaceBind=False):

    case = order in ["C", "F"]
    if not case:
        raise LogicError("order must be either F or C")

    dimension = len(gpuarray.shape)
    if dimension == 2:
        if order == "C": stride = 0
        if order == "F": stride = -1
        h, w = gpuarray.shape
        d = 1
        if allowSurfaceBind:
            descrArr = ArrayDescriptor3D()
            descrArr.width = int(w)
            descrArr.height = int(h)
            descrArr.depth = int(d)
        else:
            descrArr = ArrayDescriptor()
            descrArr.width = int(w)
            descrArr.height = int(h)
    elif dimension == 3:
        if order == "C": stride = 1
        if order == "F": stride = 1
@@ -825,40 +823,40 @@ def gpuarray_to_array(gpuarray, order, allowSurfaceBind=False):
        raise LogicError("only CUDA arrays of dimension 2 or 3 are supported at the moment")

    if gpuarray.dtype == np.complex64:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int2 (hi=re, lo=im) structure
        descrArr.num_channels = 2
    elif gpuarray.dtype == np.float64:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int2 (hi, lo) structure
        descrArr.num_channels = 2
    elif gpuarray.dtype == np.complex128:
        descrArr.format = array_format.SIGNED_INT32  # Reading data as int4 (re=(hi,lo), im=(hi,lo)) structure
        descrArr.num_channels = 4
    else:
        descrArr.format = dtype_to_array_format(gpuarray.dtype)
        descrArr.num_channels = 1

    if allowSurfaceBind:
        if dimension == 2: descrArr.flags |= array3d_flags.ARRAY3D_LAYERED
        descrArr.flags |= array3d_flags.SURFACE_LDST

    cudaArray = Array(descrArr)
    if allowSurfaceBind or dimension == 3:
        copy3D = Memcpy3D()
        copy3D.set_src_device(gpuarray.ptr)
        copy3D.set_dst_array(cudaArray)
        copy3D.width_in_bytes = copy3D.src_pitch = gpuarray.strides[stride]
        copy3D.src_height = copy3D.height = int(h)
        copy3D.depth = int(d)
        copy3D()
        return cudaArray
    else:
        copy2D = Memcpy2D()
        copy2D.set_src_device(gpuarray.ptr)
        copy2D.set_dst_array(cudaArray)
        copy2D.width_in_bytes = copy2D.src_pitch = gpuarray.strides[stride]
        copy2D.src_height = copy2D.height = int(h)
        copy2D(aligned=True)
        return cudaArray
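The double-as-`int2` trick used for `np.float64` inputs in both functions can likewise be seen on the host: each 8-byte double is reinterpreted as two 32-bit integers, which the kernel must reassemble (e.g. with CUDA's `__hiloint2double` intrinsic). A host-only layout sketch:

```python
import numpy as np

d = np.array([1.0, -2.5], dtype=np.float64)

# Each float64 splits into two int32 halves, hence num_channels = 2.
halves = d.view(np.int32)
assert halves.shape == (4,)                # 2 doubles -> 4 int32 halves
assert halves.view(np.float64)[1] == -2.5  # round-trips losslessly
```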

def make_multichannel_2d_array(ndarray, order):
    """Channel count has to be the first dimension of the C{ndarray}."""
