Overview
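The last few years have seen a rise in novel differentiable graphics layers that can be inserted into neural network architectures. From spatial transformers to differentiable graphics renderers, these layers draw on years of computer vision and graphics research to build new, more efficient network architectures. Explicitly modeling geometric priors and constraints in neural networks opens the door to architectures that can be trained robustly, efficiently and, more importantly, in a self-supervised fashion.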
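At a high level, a computer graphics pipeline requires a representation of the 3D objects and their absolute positions in the scene, a description of the materials they are made of, lights, and a camera. A renderer then interprets this scene description to generate a synthetic rendering.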
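The flow from scene description to rendering can be sketched in a few lines of plain Python. This is only an illustrative toy, not the TensorFlow Graphics API: the class names (`SurfacePoint`, `PointLight`, `PinholeCamera`) and the function `render_point` are hypothetical, and the sketch assumes a single surface point, a pinhole camera looking down the +z axis, and Lambertian (purely diffuse) shading with no distance falloff.

```python
import math
from dataclasses import dataclass


@dataclass
class SurfacePoint:
    position: tuple  # (x, y, z) in camera coordinates, z > 0 is in front
    normal: tuple    # unit surface normal
    albedo: float    # diffuse reflectance in [0, 1], i.e. the "material"


@dataclass
class PointLight:
    position: tuple   # (x, y, z) in camera coordinates
    intensity: float


@dataclass
class PinholeCamera:
    focal_length: float     # in pixels
    principal_point: tuple  # (cx, cy) in pixels


def render_point(point, light, camera):
    """Return ((u, v), brightness) for one surface point."""
    x, y, z = point.position
    # Perspective projection onto the image plane.
    u = camera.focal_length * x / z + camera.principal_point[0]
    v = camera.focal_length * y / z + camera.principal_point[1]
    # Lambertian shading: brightness scales with the cosine of the angle
    # between the surface normal and the direction towards the light.
    to_light = tuple(l - p for l, p in zip(light.position, point.position))
    length = math.sqrt(sum(c * c for c in to_light))
    cos_theta = sum(n * c / length for n, c in zip(point.normal, to_light))
    brightness = point.albedo * light.intensity * max(0.0, cos_theta)
    return (u, v), brightness


# Hypothetical scene: one point facing the camera, lit head-on.
point = SurfacePoint(position=(0.5, 0.0, 2.0), normal=(0.0, 0.0, -1.0), albedo=0.8)
light = PointLight(position=(0.5, 0.0, 0.0), intensity=1.0)
camera = PinholeCamera(focal_length=100.0, principal_point=(32.0, 32.0))

pixel, brightness = render_point(point, light, camera)
print(pixel, brightness)
```

Every input here (object position, material, light, camera) is an explicit scene parameter, and the output is a function of those parameters. That is the property a differentiable renderer builds on.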
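In comparison, a computer vision system starts from an image and tries to infer the parameters of the scene. This makes it possible to predict which objects are in the scene, what materials they are made of, and their three-dimensional position and orientation.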
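Training machine learning systems capable of solving these complex 3D vision tasks most often requires large quantities of data. Because labelling data is a costly and complex process, it is important to be able to design machine learning models that can comprehend the three-dimensional world while being trained without much supervision. Combining computer vision and computer graphics techniques provides an opportunity to leverage the vast amounts of readily available unlabelled data. This can, for instance, be achieved using analysis by synthesis: the vision system extracts the scene parameters, and the graphics system renders an image back from them. If the rendering matches the original image, the vision system has accurately extracted the scene parameters. In this setup, computer vision and computer graphics go hand in hand, forming a single machine learning system similar to an autoencoder, which can be trained in a self-supervised manner.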
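The analysis-by-synthesis loop can be illustrated with a deliberately small sketch in plain Python. It is not the TensorFlow Graphics API, and it makes several simplifying assumptions: the "scene" has a single parameter (the center of a 1D Gaussian blob), the hypothetical `render` function stands in for a differentiable renderer, and instead of training a vision network the sketch optimizes the scene parameter directly by gradient descent on the difference between the rendering and the observed image.

```python
import math


def render(center, num_pixels=64, sigma=6.0):
    """Toy differentiable 'renderer': a 1D Gaussian blob at `center`."""
    return [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
            for x in range(num_pixels)]


def loss_and_grad(center, target, sigma=6.0):
    """Squared image difference and its derivative w.r.t. `center`."""
    image = render(center, len(target), sigma)
    loss = 0.0
    grad = 0.0
    for x, (pixel, target_pixel) in enumerate(zip(image, target)):
        residual = pixel - target_pixel
        loss += residual ** 2
        # d(pixel)/d(center) = pixel * (x - center) / sigma^2
        grad += 2.0 * residual * pixel * (x - center) / sigma ** 2
    return loss, grad


def fit_center(target, initial_center, learning_rate=1.0, steps=200):
    """Recover the scene parameter by comparing renderings to the target."""
    center = initial_center
    for _ in range(steps):
        _, grad = loss_and_grad(center, target)
        center -= learning_rate * grad
    return center


# The "observed image" is rendered from a center the optimizer never sees.
true_center = 40.3
observed = render(true_center)

estimated = fit_center(observed, initial_center=30.0)
final_loss, _ = loss_and_grad(estimated, observed)
print(f"estimated center: {estimated:.3f}, loss: {final_loss:.2e}")
```

No label for the blob position is ever supplied; the only training signal is how well the re-rendered image matches the observation. In a full system, a neural network would predict the scene parameters from the image, and the same rendering loss would be backpropagated through the renderer into the network weights.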
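TensorFlow Graphics is being developed to help tackle these types of challenges. To do so, it provides a set of differentiable graphics and geometry layers (e.g. cameras, reflectance models, spatial transformations, mesh convolutions) and 3D viewer functionalities (e.g. 3D TensorBoard) that can be used to train and debug your machine learning models of choice.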
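Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2022-06-07 UTC.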