Basic protocols in quantum reinforcement learning with superconducting circuits

Lucas Lamata

Sci Rep. 2017 May 9;7(1):1609. doi: 10.1038/s41598-017-01711-6
Abstract

Superconducting circuit technologies have recently achieved quantum protocols involving closed feedback loops. Quantum artificial intelligence and quantum machine learning are emerging fields within quantum technologies that may enable quantum devices to acquire information from the outer world and improve themselves via a learning process. Here we propose the implementation of basic protocols in quantum reinforcement learning, with superconducting circuits employing feedback-loop control. We introduce diverse scenarios for proof-of-principle experiments with state-of-the-art superconducting circuit technologies and analyze their feasibility in the presence of imperfections. The field of quantum artificial intelligence implemented with superconducting circuits paves the way for enhanced quantum control and quantum computation protocols.


Conflict of interest statement

The author declares that they have no competing interests.

Figures

Figure 1
Scheme of reinforcement learning. In each learning cycle, an Agent, denoted by S, interacts with an Environment, denoted by E, performing some Action (A) on it and gathering information, or Percept (P), about its relation to it. The information obtained is then used to decide a strategy for optimizing the agent, based on a Reward Criterion, whose aim may be to maximize a Learning Fidelity. Afterwards, a new cycle begins. The situation in the quantum realm is similar, and can range between having a quantum version of the agent, of the environment, or of both, as well as interactions between them through quantum and/or classical channels with feedforward.
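The act-perceive-reward cycle in the caption can be sketched classically. The following is a minimal, illustrative loop (all names and the 0.1 fidelity increment are hypothetical, not from the paper): the agent acts on a two-valued environment, perceives whether its action matched, and either reinforces its current strategy or switches action.

```python
def learning_cycle(agent_policy, environment_state, n_cycles=100):
    """Minimal reinforcement-learning loop: act, perceive, reward, update.
    All names and numbers here are illustrative, not from the paper."""
    fidelity = 0.0
    for _ in range(n_cycles):
        action = agent_policy["action"]             # Action (A) on the environment
        percept = int(action == environment_state)  # Percept (P): did the action match?
        reward = percept                            # Reward criterion: 1 if matched
        if reward:
            fidelity = min(1.0, fidelity + 0.1)     # reinforce the current strategy
        else:
            agent_policy["action"] = 1 - action     # otherwise, try the other action
    return agent_policy, fidelity
```

After enough cycles the agent's action matches the environment and the learning fidelity saturates at its maximum.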
Figure 2
Quantum reinforcement learning for one qubit. We depict the circuit representation of the proposed learning protocol. S, E and R denote the agent, environment and register qubits, respectively. CNOT gates between E and R, as well as between S and R, are depicted with the standard notation. M is a measurement in the chosen computational basis, while U_S and U_R are local operations on the agent and register, respectively, conditional on the measurement outcomes via a classical feedback loop. The double lines denote classical information being fed forward. The protocol can be iterated upon changes in the environment.
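The core feedback primitive in the caption (entangle the environment qubit with the register via a CNOT, measure the register, and feed the outcome forward into a conditional local operation on the agent) can be sketched numerically. The following is an illustrative single-shot NumPy simulation, not the paper's implementation; the environment angle `theta` is an arbitrary choice unknown to the agent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-qubit X gate and the CNOT (control = first qubit)
X = np.array([[0, 1], [1, 0]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def measure_qubit(state, qubit, n_qubits, rng):
    """Projectively measure one qubit of an n-qubit state vector in the
    computational basis; return (outcome, collapsed normalized state)."""
    idx = np.arange(2 ** n_qubits)
    mask1 = ((idx >> (n_qubits - 1 - qubit)) & 1) == 1  # indices where qubit is 1
    p1 = (np.abs(state) ** 2)[mask1].sum()
    outcome = int(rng.random() < p1)
    keep = mask1 if outcome else ~mask1
    collapsed = np.where(keep, state, 0)
    return outcome, collapsed / np.linalg.norm(collapsed)

# Environment qubit E in a state unknown to the agent; register R in |0>
theta = 1.1  # hypothetical angle
E = np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)
ER = np.kron(E, np.array([1, 0], dtype=complex))

# CNOT from E to R copies E's computational-basis population onto R
ER = CNOT @ ER
outcome, ER = measure_qubit(ER, qubit=1, n_qubits=2, rng=rng)

# Feedback: the conditional local operation U_S flips the agent S (initially |0>)
# so that it matches the measured environment outcome
S = np.array([1, 0], dtype=complex)
if outcome == 1:
    S = X @ S
```

After the feedback step the agent qubit is aligned with the measurement outcome, which is the single-shot essence of the measurement-plus-feedforward loop the circuit depicts.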
Figure 3
Quantum reinforcement learning for a multiqubit system, I. We depict the circuit representation of the proposed learning protocol. S, E and R denote the agent, environment and register two-qubit states, respectively. CNOT gates between the respective pairs of qubits in E and R, as well as in S and R, are depicted with the standard notation. M_1 is a measurement in the chosen computational basis on the first qubit of the register, and M_2 is a measurement in the chosen computational basis on the two-qubit register state, while U_S and U_R are local operations on the agent and register, respectively, conditional on the measurement outcomes via a classical feedback loop. The narrow double lines denote classical information being fed forward, while the wider horizontal double lines denote two-qubit states. The protocol can be iterated upon changes in the environment via reset of the agent.
Figure 4
Quantum reinforcement learning for a multiqubit system, II. We depict the circuit representation of the proposed learning protocol. S, E and R denote the agent, environment and register two-qubit states, respectively. CNOT gates between the respective pairs of qubits in E and R, as well as in S and R, are depicted with the standard notation. In this case, the measurement M acts on both register qubits and is performed in the chosen computational basis. U_S and U_R are local operations on the agent and register, respectively, conditional on the measurement outcomes via a classical feedback loop. The narrow double lines denote classical information being fed forward, while the wider horizontal double lines denote two-qubit states. The protocol can be iterated upon changes in the environment via reset of the agent.
Figure 5
Scheme of the proposed implementation. In the most complex example proposed, we consider six superconducting qubits inside a 3D cavity, distributed in two rows along the cavity axis (an alternative configuration would be two three-qubit columns perpendicular to the cavity axis). Amp denotes the amplification process, C represents the controller device, and U is a local operation on the qubits conditioned on the classical feedback loop.


