VPI - Vision Programming Interface

3.2 Release

FFT

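Overview

The FFT application outputs the spectrum representation of an input image and saves it to an image file on disk. You can choose the backend used for processing.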
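
To make "spectrum representation" concrete, here is a minimal sketch of a 2D discrete Fourier transform written with the Python standard library only. It is not how VPI computes the transform (an FFT gives the same result far faster); the function name `dft2` and the 2x2 test image are assumptions made for illustration.

```python
import cmath


def dft2(image):
    """Naive 2D discrete Fourier transform of a list-of-rows image.

    Hypothetical helper for illustration; O(N^2) per output sample.
    """
    height, width = len(image), len(image[0])
    spectrum = []
    for u in range(height):            # vertical frequency index
        row = []
        for v in range(width):         # horizontal frequency index
            acc = 0j
            for y in range(height):
                for x in range(width):
                    angle = -2 * cmath.pi * (u * y / height + v * x / width)
                    acc += image[y][x] * cmath.exp(1j * angle)
            row.append(acc)
        spectrum.append(row)
    return spectrum


image = [[1.0, 2.0],
         [3.0, 4.0]]
spectrum = dft2(image)

# spectrum[0][0] is the 0 Hz (DC) term, which equals the sum of all pixels.
print(abs(spectrum[0][0]))
```

Each output sample measures how strongly one horizontal/vertical frequency pair is present in the image; the sample at index (0, 0) is the sum of all pixels.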

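Instructions

The command line parameters are:

<backend> <input image>

where

  • backend: either cpu or cuda; it defines the backend that will perform the processing.
  • input image: input image file name; it accepts png, jpeg and possibly other formats.

Here's one example: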

  • C++
    ./vpi_sample_07_fft cuda ../assets/kodim08.png
  • Python
    python3 main.py cuda ../assets/kodim08.png

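This uses the CUDA backend and one of the provided sample images. You can try other images, as long as they respect the constraints imposed by the algorithm.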

Results

[Figures: input image; output image (spectrum)]

Source Code

For convenience, here's the code that is also installed in the samples directory.

Python

import sys
import vpi
import numpy as np
from PIL import Image
from argparse import ArgumentParser

# ----------------------------
# Parse command line arguments

parser = ArgumentParser()
parser.add_argument('backend', choices=['cpu', 'cuda'],
                    help='Backend to be used for processing')

parser.add_argument('input',
                    help='Input image in space domain')

args = parser.parse_args()

if args.backend == 'cpu':
    backend = vpi.Backend.CPU
else:
    assert args.backend == 'cuda'
    backend = vpi.Backend.CUDA

# --------------------------------------------------------------
# Load input into a vpi.Image and convert it to float grayscale
with vpi.Backend.CUDA:
    try:
        input = vpi.asimage(np.asarray(Image.open(args.input))).convert(vpi.Format.F32)
    except IOError:
        sys.exit("Input file not found")
    except Exception:
        sys.exit("Error with input file")

# --------------------------------------------------------------
# Transform input into frequency domain
with backend:
    hfreq = input.fft()

# --------------------------------------------------------------
# Post-process results and save to disk

# Transform [H,W,2] float array into [H,W] complex array
hfreq = hfreq.cpu().view(dtype=np.complex64).squeeze(2)

# Complete array into a full hermitian matrix.
# hfreq holds only the left W//2+1 columns; the remaining columns are
# mirrored copies, so pad without repeating the edge column (even width)
# or with repeating it (odd width).
if input.width % 2 == 0:
    wpad = input.width // 2 - 1
    padmode = 'reflect'
else:
    wpad = input.width // 2
    padmode = 'symmetric'
freq = np.pad(hfreq, ((0, 0), (0, wpad)), mode=padmode)
# The mirrored columns are complex conjugates...
freq[:, hfreq.shape[1]:] = np.conj(freq[:, hfreq.shape[1]:])
# ...and their rows (except row 0) are in reversed order.
freq[1:, hfreq.shape[1]:] = freq[1:, hfreq.shape[1]:][::-1]

# Shift 0Hz to image center
freq = np.fft.fftshift(freq)

# Convert complex frequencies into log-magnitude
lmag = np.log(1 + np.absolute(freq))

# Normalize into [0,255] range
min = lmag.min()
max = lmag.max()
lmag = ((lmag - min) * 255 / (max - min)).round().astype(np.uint8)

# -------------------
# Save result to disk
Image.fromarray(lmag).save('spectrum_python' + str(sys.version_info[0]) + '_' + args.backend + '.png')
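
The "complete array into a full hermitian matrix" step relies on the identity F[i][j] = conj(F[(H-i) % H][(W-j) % W]), which holds for the spectrum of any real-valued image. The sketch below checks that identity on a small image using only the standard library, so it runs without VPI or NumPy. The helpers `dft2` and `complete_hermitian` and the 4x5 test image are assumptions made for illustration, not part of the sample.

```python
import cmath


def dft2(image):
    """Naive 2D DFT (hypothetical reference helper, fine for tiny images)."""
    height, width = len(image), len(image[0])
    return [[sum(image[y][x]
                 * cmath.exp(-2j * cmath.pi * (u * y / height + v * x / width))
                 for y in range(height) for x in range(width))
             for v in range(width)]
            for u in range(height)]


def complete_hermitian(half, full_width):
    """Rebuild the full [H][W] spectrum from its left W//2+1 columns."""
    height = len(half)
    half_width = len(half[0])
    full = []
    for i in range(height):
        row = list(half[i])
        for j in range(half_width, full_width):
            # Missing sample = conjugate of the point-mirrored stored sample.
            mirrored = half[(height - i) % height][(full_width - j) % full_width]
            row.append(mirrored.conjugate())
        full.append(row)
    return full


# Small real image with even height and odd width.
image = [[1.0, 5.0, 2.0, 0.0, 3.0],
         [4.0, 1.0, 7.0, 2.0, 6.0],
         [0.0, 3.0, 1.0, 8.0, 2.0],
         [9.0, 2.0, 4.0, 1.0, 5.0]]
width = len(image[0])

full = dft2(image)                                  # reference: all W columns
half = [row[: width // 2 + 1] for row in full]      # what a real-input FFT stores
rebuilt = complete_hermitian(half, width)

max_error = max(abs(a - b)
                for row_a, row_b in zip(rebuilt, full)
                for a, b in zip(row_a, row_b))
print(max_error < 1e-9)
```

The NumPy code above reaches the same result with `np.pad`, `np.conj` and a row reversal; the explicit index arithmetic here is the form used by `CompleteFullHermitian` in the C++ sample.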
C++

#include <opencv2/core/version.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#if CV_MAJOR_VERSION >= 3
#    include <opencv2/imgcodecs.hpp>
#else
#    include <opencv2/highgui/highgui.hpp>
#endif

#include <vpi/OpenCVInterop.hpp>

#include <vpi/Image.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>
#include <vpi/algo/ConvertImageFormat.h>
#include <vpi/algo/FFT.h>

#include <cassert>
#include <cstring> // for memset
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

#define CHECK_STATUS(STMT)                                    \
    do                                                        \
    {                                                         \
        VPIStatus status = (STMT);                            \
        if (status != VPI_SUCCESS)                            \
        {                                                     \
            char buffer[VPI_MAX_STATUS_MESSAGE_LENGTH];       \
            vpiGetLastStatusMessage(buffer, sizeof(buffer));  \
            std::ostringstream ss;                            \
            ss << vpiStatusGetName(status) << ": " << buffer; \
            throw std::runtime_error(ss.str());               \
        }                                                     \
    } while (0)

// Auxiliary functions to process spectrum before saving it to disk.
cv::Mat LogMagnitude(cv::Mat cpx);
cv::Mat CompleteFullHermitian(cv::Mat in, cv::Size fullSize);
cv::Mat InplaceFFTShift(cv::Mat mag);

int main(int argc, char *argv[])
{
    // OpenCV image that will be wrapped by a VPIImage.
    // Define it here so that it's destroyed *after* wrapper is destroyed
    cv::Mat cvImage;

    // VPI objects that will be used
    VPIImage image    = NULL;
    VPIImage imageF32 = NULL;
    VPIImage spectrum = NULL;
    VPIStream stream  = NULL;
    VPIPayload fft    = NULL;

    int retval = 0;

    try
    {
        // =============================
        // Parse command line parameters

        if (argc != 3)
        {
            throw std::runtime_error(std::string("Usage: ") + argv[0] + " <cpu|cuda> <input image>");
        }

        std::string strBackend       = argv[1];
        std::string strInputFileName = argv[2];

        // Now parse the backend
        VPIBackend backend;

        if (strBackend == "cpu")
        {
            backend = VPI_BACKEND_CPU;
        }
        else if (strBackend == "cuda")
        {
            backend = VPI_BACKEND_CUDA;
        }
        else
        {
            throw std::runtime_error("Backend '" + strBackend + "' not recognized, it must be either cpu or cuda.");
        }

        // =====================
        // Load the input image

        cvImage = cv::imread(strInputFileName);
        if (cvImage.empty())
        {
            throw std::runtime_error("Can't open '" + strInputFileName + "'");
        }

        // =================================
        // Allocate all VPI resources needed

        // Create the stream for the given backend.
        CHECK_STATUS(vpiStreamCreate(backend, &stream));

        // We now wrap the loaded image into a VPIImage object to be used by VPI.
        // VPI won't make a copy of it, so the original
        // image must be in scope at all times.
        CHECK_STATUS(vpiImageCreateWrapperOpenCVMat(cvImage, 0, &image));

        // Temporary image that holds the float version of input
        CHECK_STATUS(vpiImageCreate(cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_F32, 0, &imageF32));

        // Now create the output image. Note that for real inputs, the output spectrum is a Hermitian
        // matrix (conjugate-symmetric), so only the non-redundant components are output, basically the
        // left half. We adjust the output width accordingly.
        CHECK_STATUS(vpiImageCreate(cvImage.cols / 2 + 1, cvImage.rows, VPI_IMAGE_FORMAT_2F32, 0, &spectrum));

        // Create the FFT payload that does real (space) to complex (frequency) transformation
        CHECK_STATUS(
            vpiCreateFFT(backend, cvImage.cols, cvImage.rows, VPI_IMAGE_FORMAT_F32, VPI_IMAGE_FORMAT_2F32, &fft));

        // ================
        // Processing stage

        // Convert image to float
        CHECK_STATUS(vpiSubmitConvertImageFormat(stream, backend, image, imageF32, NULL));

        // Submit the FFT, passing the float input image and the output spectrum image
        CHECK_STATUS(vpiSubmitFFT(stream, backend, fft, imageF32, spectrum, 0));

        // Wait until the algorithm finishes processing
        CHECK_STATUS(vpiStreamSync(stream));

        // =======================================
        // Output processing and saving it to disk

        // Lock output image to retrieve its data on cpu memory
        VPIImageData outData;
        CHECK_STATUS(vpiImageLockData(spectrum, VPI_LOCK_READ, VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR, &outData));

        assert(outData.bufferType == VPI_IMAGE_BUFFER_HOST_PITCH_LINEAR);
        VPIImageBufferPitchLinear &outPitch = outData.buffer.pitch;

        assert(outPitch.format == VPI_IMAGE_FORMAT_2F32);

        // Wrap spectrum to be used by OpenCV
        cv::Mat cvSpectrum(outPitch.planes[0].height, outPitch.planes[0].width, CV_32FC2, outPitch.planes[0].data,
                           outPitch.planes[0].pitchBytes);

        // Process it
        cv::Mat mag = InplaceFFTShift(LogMagnitude(CompleteFullHermitian(cvSpectrum, cvImage.size())));

        // Normalize the result to fit in 8-bits
        normalize(mag, mag, 0, 255, cv::NORM_MINMAX);

        // Write to disk
        imwrite("spectrum_" + strBackend + ".png", mag);

        // Done handling output image, don't forget to unlock it.
        CHECK_STATUS(vpiImageUnlock(spectrum));
    }
    catch (std::exception &e)
    {
        std::cerr << e.what() << std::endl;
        retval = 1;
    }

    // ========
    // Clean up

    // Make sure stream is synchronized before destroying the objects
    // that might still be in use.
    if (stream != NULL)
    {
        vpiStreamSync(stream);
    }

    vpiImageDestroy(image);
    vpiImageDestroy(imageF32);
    vpiImageDestroy(spectrum);
    vpiStreamDestroy(stream);

    // Payload is owned by the stream, so it's already destroyed
    // since the stream is now destroyed.

    return retval;
}

// Auxiliary functions --------------------------------

cv::Mat LogMagnitude(cv::Mat cpx)
{
    // Split spectrum into real and imaginary parts
    cv::Mat reim[2];
    assert(cpx.channels() == 2);
    split(cpx, reim);

    // Calculate the magnitude
    cv::Mat mag;
    magnitude(reim[0], reim[1], mag);

    // Convert to logarithm scale
    mag += cv::Scalar::all(1);
    log(mag, mag);

    // Crop to even dimensions so the quadrant swap in InplaceFFTShift is exact
    mag = mag(cv::Rect(0, 0, mag.cols & -2, mag.rows & -2));

    return mag;
}

cv::Mat CompleteFullHermitian(cv::Mat in, cv::Size fullSize)
{
    assert(in.type() == CV_32FC2);

    cv::Mat out(fullSize, CV_32FC2);
    for (int i = 0; i < out.rows; ++i)
    {
        for (int j = 0; j < out.cols; ++j)
        {
            cv::Vec2f p;
            if (j < in.cols)
            {
                // Left half: stored directly in the input
                p = in.at<cv::Vec2f>(i, j);
            }
            else
            {
                // Right half: complex conjugate of the point-mirrored sample
                p    = in.at<cv::Vec2f>((out.rows - i) % out.rows, (out.cols - j) % out.cols);
                p[1] = -p[1];
            }
            out.at<cv::Vec2f>(i, j) = p;
        }
    }

    return out;
}

cv::Mat InplaceFFTShift(cv::Mat mag)
{
    // Rearrange the quadrants of the fourier spectrum
    // so that the origin is at the image center.

    // Create a ROI for each of the 4 quadrants.
    int cx = mag.cols / 2;
    int cy = mag.rows / 2;
    cv::Mat qTL(mag, cv::Rect(0, 0, cx, cy));   // top-left
    cv::Mat qTR(mag, cv::Rect(cx, 0, cx, cy));  // top-right
    cv::Mat qBL(mag, cv::Rect(0, cy, cx, cy));  // bottom-left
    cv::Mat qBR(mag, cv::Rect(cx, cy, cx, cy)); // bottom-right

    // swap top-left with bottom-right quadrants
    cv::Mat tmp;
    qTL.copyTo(tmp);
    qBR.copyTo(qTL);
    tmp.copyTo(qBR);

    // swap top-right with bottom-left quadrants
    qTR.copyTo(tmp);
    qBL.copyTo(qTR);
    tmp.copyTo(qBL);

    return mag;
}
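
Both listings finish with the same display steps: take log(1 + |z|) of each complex sample, rescale to the 8-bit range, and move the 0 Hz term to the image center. The sketch below reproduces those steps with the Python standard library only, as an illustration rather than a replacement for the sample code. The names `log_magnitude_u8` and `fftshift2` and the 2x4 test spectrum are assumptions, and the guard for a constant spectrum is an addition that the sample does not have.

```python
import math


def log_magnitude_u8(spectrum):
    """Map complex samples to log(1 + |z|), rescaled to integers in [0, 255]."""
    lmag = [[math.log(1 + abs(z)) for z in row] for row in spectrum]
    lo = min(min(row) for row in lmag)
    hi = max(max(row) for row in lmag)
    # Assumption: map a constant spectrum to all zeros instead of dividing by 0.
    scale = 255 / (hi - lo) if hi > lo else 0.0
    return [[round((value - lo) * scale) for value in row] for row in lmag]


def fftshift2(matrix):
    """Cyclically shift rows and columns so index (0, 0) lands at (H//2, W//2)."""
    height, width = len(matrix), len(matrix[0])
    cy, cx = height // 2, width // 2
    return [[matrix[(i - cy) % height][(j - cx) % width] for j in range(width)]
            for i in range(height)]


# Hypothetical 2x4 spectrum with the 0 Hz (DC) term at index (0, 0).
spectrum = [[10 + 0j, -2 + 0j, 0j, -2 + 0j],
            [-4 + 0j, 0j, 1j, 0j]]

pixels = fftshift2(log_magnitude_u8(spectrum))

# The DC term has the largest magnitude, so it becomes the brightest pixel
# and ends up at the center position (row H//2, column W//2).
print(pixels[1][2])
```

The cyclic shift works for any size. The quadrant swap in `InplaceFFTShift` matches it only when both dimensions are even, which is why `LogMagnitude` first crops the matrix with `mag.cols & -2` and `mag.rows & -2`.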