

Quantum Physics of Semiconductor Materials and Devices

D. Jena
Cornell University

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© Debdeep Jena 2022

The moral rights of the author have been asserted

Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2021952780

ISBN 978-0-19-885684-9 (hbk)
ISBN 978-0-19-885685-6 (pbk)

DOI: 10.1093/oso/9780198856849.001.0001

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

“I am one of those who think like Nobel, that humanity will draw more good than evil from new discoveries.”–Marie Curie

“No problem can be solved from the same level of consciousness that created it. We must learn to view the world anew.”–Albert Einstein

“One shouldn't work on semiconductors, that is a filthy mess; who knows whether any semiconductors exist.”–Wolfgang Pauli

“I am thinking of something much more important than bombs. I am thinking about computers.”–John von Neumann

“It is frequently said that having a more-or-less specific practical goal in mind will degrade the quality of research. I do not believe that this is necessarily the case and to make my point in this lecture I have chosen my examples of the new physics of semiconductors from research projects which were very definitely motivated by practical considerations...”–William Shockley, Nobel Lecture (1956)

“Some people can do one thing magnificently, like Michelangelo. Others make things like semiconductors, or build 747 airplanes - that type of work requires legions of people. In order to do things well, that can't be done by one person, you must find extraordinary people.”–Steve Jobs

Preface

“Semiconductor electronics requires for its foundation primarily wave mechanics and statistics. However, crystallography, thermodynamics, and chemistry also have a share in it and, quite generally, it is incredible what miserable quantities of thought and mathematics are needed to provide even the simplest tools for daily use in semiconductor physics.”–Eberhard Spenke and Walter Schottky

Several excellent books and monographs have been written about the physics of semiconductors and nanostructures. I could not resist reproducing the above paragraph from an early classic, Spenke's Electronic Semiconductors, written in 1958. To this day, every author of a book on this subject struggles with the same pedagogical challenge that pioneers such as Spenke and Shockley faced when writing the first books on the topic in the 1950s, when the field was in its infancy. Consider the simplest physical processes that occur in semiconductors: electron or hole transport in bands and over barriers, collisions of electrons with the atoms of the crystal, or the annihilation of an electron and a hole to produce a photon. The correct explanation of these processes requires a quantum mechanical treatment. Any shortcut leads to misconceptions that can take years to dispel, and sometimes becomes a roadblock towards a deeper understanding and appreciation of the richness of the subject. A typical introductory course on semiconductor physics would then require prerequisites of quantum mechanics, statistical physics and thermodynamics, materials science, and electromagnetism. Rarely does a student have all of this background when taking a course of this nature at most universities.

What has changed since the 1950s? Semiconductor devices have become indispensable and integral to our daily lives. The shift towards a semiconductor electronics- and photonics-powered information economy occurred near the turn of the century.
This was not the case when the early books on semiconductor physics were written. The connection of the physics to today's information systems, such as transistors for logic, memory, and signal amplification, and light-emitting diodes and lasers for lighting and communications, makes the subject tangible and alive rather than abstract. Practitioners of the science, technology, and art of semiconductor physics and devices are the “quantum mechanics” of our age in the true sense of the word. They reside in several leading industries, research laboratories, and universities, and are changing the world, one bit (or photon!) at a time.

viii Preface

The quantum physics of semiconductors is not abstract: it is measured routinely as currents and voltages in diodes and transistors, and seen in the optical spectra of semiconductor lasers. The glow of semiconductor quantum well light-emitting diodes in our rooms and cell phone screens puts the power and utility of understanding the quantum physics of semiconductors and nanostructures on display right before our eyes. Asher Peres captured this philosophy beautifully:

“Quantum phenomena do not occur in a Hilbert space. They occur in a laboratory.”–Asher Peres

This philosophy accurately reflects the approach I have taken in writing this book. Semiconductor physics is a laboratory in which to learn and discover the concepts of quantum mechanics, thermodynamics, condensed matter physics, and materials science, and the payoffs are almost immediate in the form of useful semiconductor devices. I have had the opportunity to work on both sides of the fence: on the fundamental materials science and quantum physics of semiconductors, and on their applications in semiconductor electronic and photonic devices. Drawing from this experience, I have made an effort to make each topic as tangible as possible. The concepts are developed from their experimental roots, with historical trails, personalities, and stories where possible, to reflect the subject as a human adventure. The mathematical structure is then developed to explain the experimental observations, and then to predict new phenomena and devices. I believe this is a unique approach for a book on this subject, one that distinguishes it from others in the field.

The book is aimed at third- and fourth-year undergraduate students, and at graduate students in Electrical Engineering, Materials Science and Engineering, Applied Physics, Physics, and Mechanical and Chemical Engineering departments that offer semiconductor-related courses.
It will also be of interest to scientists and engineers in industry and in various research laboratories.

The book is divided into four modules. Module I, in seven chapters, rigorously presents the fundamentals, covering the core principles of quantum mechanics, statistical thermodynamics, and the physics of free electrons. The last two chapters of Module I develop perturbation theory techniques, without which one is often disadvantaged in understanding semiconductor physics. Module II, in nine chapters, introduces the concepts of bands, gaps, effective masses, and Bloch theory; develops a few methods to calculate and understand semiconductor bandstructures; and develops methods to handle a range of semiconductor quantum heterostructures. Module III starts with the quantum physics of diodes and transistors in the ballistic limit. It then covers several electronic transport and scattering phenomena using the Boltzmann transport equation and Fermi's golden rule for transitions. High-field transport, tunneling, and quantum magnetotransport phenomena round off this nine-chapter module.


Module IV focuses on semiconductor photonics, starting from a description of the Maxwell equations and light, tracking the interaction of photons with semiconductors, and culminating in a description of semiconductor heterostructure photonic devices.

The chapter-end exercises have been tried and tested as homework assignments in classes. They considerably amplify the material discussed in the chapters, and are designed to encourage deep thinking, purposeful enquiry, and thoughtful discussion. Some problems take the reader beyond the topics of the respective chapters into current areas of research. Others connect to biology, astronomy, high-energy physics, and other fields in which semiconductors play an increasingly important role. There is no better way to hone one's skills and achieve mastery over the subject than to solve as many exercises as time permits.

Instructors should plan the usage of the book to fit their goals. The book has far more material than can (or should) be covered in a one-semester course. A typical one-semester course at Cornell, offered for senior undergraduates and beginning graduate students, covers all of Module I, most chapters of Module II (Chapters 12 and 13 are assigned as projects), Chapters 20-24 in Module III, and Chapters 27 and 29 of Module IV. To cater to the varying backgrounds of enrolled students, more time is spent on Modules I and II. Once students are comfortable with the concepts of bands and carrier statistics in semiconductors from Modules I and II, progress through Modules III and IV can be rapid. I have provided a table and guidance for instructors and students later in this preface for potential usage of the book.

No claim to originality is made for most of the presented material. Nevertheless, the process of writing for pedagogical purposes allows for fresh perspectives. Readers may encounter a few unconventional derivations, or connections that were not apparent before.
Because much of semiconductor physics originated from atomic quantum theory, such examples abound, but they have been “lost in translation” over the past few decades. I have brought them back, as semiconductors are returning to their atomic roots in the new generation of nanostructured devices. The field is alive and kicking: new semiconductors and phenomena are being discovered, synthesized, and used for applications today. The presentation of the material in this book takes these advances into the fold. If the book gives readers ideas, or makes connections for them, that enable new discoveries or inventions that outdate the topics discussed here, it will have exceeded its intended pedagogical purpose.

Colleagues and students I have talked to around the world agree that there is a place for a stand-alone book that introduces solid state physics to electrical engineers, applied physicists, and materials scientists who work on semiconductors, with semiconductor devices as the backdrop. It is my sincere hope that this book fills that void.

Debdeep Jena
Ithaca, New York, March 2022

Table 1 Suggested usage of the book for instructors, students, and readers. For each of the thirty chapters listed below, the table suggests a number of class meetings (or marks the chapter as Assigned Reading or Skip), together with the total number of meetings, for five schedules: a 1-semester course (26x75 minutes), a 1-semester course (42x50 minutes), a 1-quarter course (18x80 minutes), a 2-semester course (2x26x75 minutes, Semesters 1 and 2), and a 2-quarter course (2x18x80 minutes, Quarters 1 and 2). It also marks the chapters that make up an Electronics track and a Photonics track. The chapters, by module, are:

Module I: Fundamentals
1. And Off We Go!
2. Secrets of the Classical Electron
3. Quantum Mechanics in a Nutshell
4. Damned Lies, and Statistics
5. Electrons in the Quantum World
6. Red or Blue Pill: Befriending the Matrix
7. Perturbations to the Electron's Freedom

Module II: Bands, Doping, and Heterostructures
8. Electrons in a Crystal Get Their Bands, Gaps, and Masses
9. Bloch Theorem, Bandstructure, and Quantum Currents
10. Crystal Clear: Bandstructure of the Empty Lattice
11. Tight-Binding Bandstructure
12. k · p Bandstructure
13. 1, 2, 3, ... ∞: Pseudopotentials and Exact Bandstructure
14. Doping and Heterostructures: The Effective Mass Method
15. Carrier Statistics and Energy Band Diagrams
16. Controlling Electron Traffic in the k-Space

Module III: Quantum Electronics with Semiconductors
17. Game of Modes: Quantized R, L, and C
18. Junction Magic: Schottky, pn and Bipolar Transistors
19. Zeroes and Ones: The Ballistic Transistor
20. Fermi's Golden Rule
21. No Turning Back: The Boltzmann Transport Equation
22. Taking the Heat: Phonons and Electron-Phonon Interactions
23. Scattering, Mobility and Velocity Saturation
24. Through the Barrier: Tunneling and Avalanches
25. Running Circles: Quantum Magnetotransport

Module IV: Quantum Photonics with Semiconductors
26. Let There Be Light: Maxwell Equations
27. Light-Matter Interaction
28. Heavenly Light: Solar Cells and Photodetectors
29. Reach for the Stars: Semiconductor Lasers and LEDs
30. Every End is a New Beginning


For Instructors

Thank you for considering this book for your teaching. A typical semester-long course that uses this book will cover Module I and Module II as the basics of semiconductor physics. The remaining part of the course can take two tracks based on the interests of the instructor and the students: one that focuses on electron transport and electronics, and the other that focuses on light-matter interaction and photonics. For courses choosing to discuss electron transport and electronics, the sequence [Module I] → [Module II] → [Module III] is suggested. For courses that choose light-matter interactions and photonics, the sequence [Module I] → [Module II] → [Chapter 20 (on Fermi's golden rule) and Chapter 22 (on phonons) from Module III] → [Module IV] is suggested.

Though not absolutely essential, using curated software illustrations and incorporating them into assignments considerably amplifies the effectiveness of learning this subject. I have used Mathematica for illustrations in class. Almost all the quantitative figures and plots that appear in the book were calculated and plotted in Mathematica and touched up for the book. The publisher and I have planned to provide a Mathematica file, a set of slides for classroom illustration, and selected solutions of chapter-end Exercises in the near future to instructors who wish to use this book. Students taking my class based on the contents of this book have used their favorite software tools, not limited to Mathematica (e.g., Matlab, Python, C, or others), for some assignment problems.

The book has much more material than can be covered in depth in one semester or one quarter. The tracks suggested earlier are guidelines. In addition, Table 1 indicates a few possible course schedules to cover the materials in this book. Depending on the needs of the course and the schedule (semester vs. quarter), the suggested number of classes per chapter is listed in this Table.
Again, I am very grateful that you are considering this book. I sincerely request feedback and suggestions as you use this text. They will improve and sharpen the pedagogical impact of this book.

For Students

“Truth is like a vast tree, which yields more and more fruit, the more you nurture it.”–Mahatma Gandhi

Ever since I was an undergraduate student, I have been fascinated by quantum physics. My fascination has only grown with time! As a student I could not imagine that I would be so fortunate as to work on this topic for a career. And I certainly did not imagine that I would write a book on the topic! A colleague asked me why I was writing this book, when nobody reads books anymore. The best answer to that question is this: I have written it to give you the book that I wish I had as a student: not just the facts, but the exciting back stories and secrets that I have


learned over many years of experience with this subject. The book you are holding has two purposes. The first is to introduce you to the wonderful world of semiconductors, which combines quantum physics and practical applications in a delightful way. The second is to introduce you to the historical backstories and personalities, to show where the discoveries actually came from! I hope you enjoy reading the book.

I will be honest: the book came out much longer than I originally intended. At some point, I discussed with my editor whether it would be better to split the contents into two volumes. We decided against it, reasoning that there are many advantages to having all the contents in one place. Besides, the “buy one get one free” deal is a sweet one! The course in which you use this book will probably cover Modules I and II, with some parts of Modules III and/or IV. Modules III and IV bring you up to several areas of today's research activity. I hope you use the book beyond the class(es) in which it is assigned.

I have two pieces of advice and two requests. My first piece of advice is to use a software tool (e.g., Mathematica, Matlab, Python, or others of similar capability), and the freely available tool 1D Poisson, throughout the class in which you are prescribed this textbook. The second is that as you learn the subject, you will go deep into a lot of details (which in quantum physics means dotting the i's and crossing the ℏ's!). As you do so, always step back and ask the “why” behind the “what”. That is what this book is about: it tries to answer why things in semiconductors are the way they are. If the “why” part is not answered adequately, please let me know so that I can fix it in the next edition.

Now for my requests. As I collected the list of personalities for the historical parts of the book, the lack of diversity became sadly apparent.
I have tried my best to provide a fair representation, but it is not enough. There are two ways you can help solve this problem. First, if you know or learn of contributors who should appear in the book but do not, please let me know. Second, it is my genuine hope that the book motivates students of diverse backgrounds to join and shape the next semiconductor revolutions. Please seriously consider giving the field a try. If you like even a fraction of the contents of this book, you will be pleasantly surprised by how good a career it can be, both for personal fulfillment and for service to humanity!

For Self Study

An effective way to self-study is to pair reading the book with viewing freely available online videos on this subject. It is important to stick to a schedule, preferably a semester-long one as indicated in Table 1. Watching and taking notes for 2-3 lectures/week, coupled with a few hours of critical reading of the corresponding chapters of this book with a pencil and notebook, and, importantly, solving a few Exercises at the end of each chapter, will lead to the most effective learning.

Acknowledgements

Only at the end of the long period of writing this book did I come to accept what I had heard but never believed: that a writer never completes a book. He or she merely abandons it. It is time for me to hand over this book to the publishers.

My cup of acknowledgements and gratitude is spilling over - I need a bucket! I am terrified that inevitably there are names I have missed in the following list. To them, I say: please do not be shy, let me know, and it will be fixed. The Table at the end of this section lists the people who have generously given their time to read parts of the book, or have provided figures that are included in the text. The readers of this book, and I, will remain forever grateful for your selfless service. That you have taken time out of your schedules to do this (in the midst of a global pandemic) says this loud and clear: that you love this subject, and want others to do so too. That you are from all around the world also shows that this love knows no boundaries. The Table does not list the scores of students who have used older versions of the book in class, caught errors, provided feedback and constructive criticism, worked through Exercises, and jumped with joy when they got the bandstructure of silicon to work in class - you deserve the subtle tip of the hat.

After fixing the errors you brought to my notice, and those brought to my notice by a very careful copy editor from Oxford University Press, the laws of probability dictate that in a book of this size, several will remain. These are entirely my own, and I will strive to fix them in the future.

I cannot thank my editor Sönke Adlung from Oxford University Press enough. Throughout the prolonged process of writing, he provided valuable suggestions and encouragement in just the right quanta. I must have tested his incredible patience by missing my self-imposed deadlines! He gracefully saw the massive project through.
I have learned a large part of the material that appears in this book from my research group members, colleagues, and collaborators. I have had the fortune of working with them over the last two decades, and I express my sincere gratitude to them. I am grateful for the financial and intellectual support of the funding agencies and program managers behind the research that made it into several sections of this book. My home departments and the engineering dean's office at Cornell University have provided support in so many ways - such as picking up the tab for the printing and distribution of early versions of this manuscript for the entire class. You do not have to do that anymore!


The reader of every book experiences shades of the author's scientific ancestry in the manner and the mix in which the subject is presented. It is difficult to express in words my gratitude to the teachers who shaped my thinking in this field, from childhood, through undergraduate school at IIT Kanpur, to graduate studies at the University of California, Santa Barbara (UCSB). As for the specific contents of this book, I would like to acknowledge a few of my teachers. At IIT Kanpur, Prof. Avinash Singh ignited my interest in quantum mechanics and statistical mechanics, and Prof. Aloke Dutta introduced me to the breadth of semiconductor devices. At UCSB, I particularly thank Prof. Umesh Mishra for instilling in me some of his infectious joy of semiconductor devices, and Prof. Herbert Kroemer for showing me that quantum physics is in the DNA of semiconductors and powers their practical applications. Little did I know while doing research in graduate school that I was “learning to hunt with the lions”, so to speak. I hope this book, in addition to providing new perspectives, solidifies the timeless ones I learned from my teachers.

For most, graduate school provides an education and a career. I am among the most blessed: it also provided a soulmate! Since our graduate school days at UCSB, Huili (Grace) Xing and I have been fellow teachers and scientists at work, and parents at home. We have worked in the field of semiconductors together for the past two decades, and we continue to learn so much about the subject from each other and from our joint research group. The final year of writing this book demanded nearly all my attention and energy. I would be remiss if I did not say this: that this task truly took a toll on the time I could muster to run an active research group, and far more on the time I could devote to my family at home.
Grace has skillfully kept both work and home chugging away, and has created spacetime out of nowhere for me to hide in and finish this work. How do you thank that! I would not have wanted to put up with myself socially during the long and unending hours spent finalizing the last versions of the book. Our son Shaan, who is now 14 and just grew taller than me (over the course of the writing of this book!), has had to handle a Dad at home hiding in the study room in the evenings, emotionally unavailable for days and weeks. My only parenting in the past months has been to place restrictions on his gaming hours! Yet, Grace and he have put up with me in good humor. I hope I do enough now to make up for the lost time. Shaan put his artistic skills to work in the first designs of the book cover, which have morphed into what you see now.

Finally, I want to thank Grace's and my families, and especially our parents. They taught us to dream big and to stay grounded. They take the greatest joy in our successes, and absorb the pains and failures. Grace's parents have filled the void created by my absence due to the book by showering our son with their affection. My parents lovingly demanded and acquired from me several earlier drafts of this book. My mother has the pre-final draft, but sadly my father did not live to see it, or the final version. This book is dedicated to his memory.


Table 2 Thank you for your help!

Aaron Franklin (Duke University)
Alan Seabaugh (University of Notre Dame)
Amit Verma (Indian Institute of Technology, Kanpur)
Andres Godoy (University of Granada)
Anthony Hoffman (University of Notre Dame)
Berardi Sensale Rodriguez (University of Utah)
Changkai Yu (Cornell University)
Chris van de Walle (University of California at Santa Barbara)
Clifford Pollock (Cornell University)
Dan Ritter (Technion, Israel Institute of Technology)
David Esseni (University of Udine)
Edward Yu (University of Texas at Austin)
Enrique Marin (University of Granada)
Eric Pop (Stanford University)
Eungkyun Kim (Cornell University)
Evelyn Hu (Harvard University)
Farhan Rana (Cornell University)
Greg Muziol (UNIPRESS, Warsaw)
Guillaume Cassabois (University of Montpellier)
Hari Nair (Cornell University)
Henryk Turski (UNIPRESS, Warsaw)
Huili (Grace) Xing (Cornell University)
Jamie Phillips (University of Delaware)
Jan Kuzmik (Slovak Academy of Sciences)
Jashan Singhal (Cornell University)
Jason Petta (Princeton University)
Jean Pierre Leburton (University of Illinois at Urbana Champaign)
Jimy Encomendero (Cornell University)
John Simon (National Renewable Energy Laboratory)
Jon McCandless (Cornell University)
Joseph Dill (Cornell University)
Kevin Lee (Cornell University)
Kirstin Alberi (National Renewable Energy Laboratory)
Len van Deurzen (Cornell University)
Łukasz Janicki (Wroclaw University of Technology)
Matteo Meneghini (University of Padova)
Michael Stroscio (University of Illinois at Chicago)
Nobuya Mori (Osaka University)
Phillip Dang (Cornell University)
Reet Chaudhuri (Cornell University)
Rongming Chu (Pennsylvania State University)
Samuel Bader (Cornell University)
Sanjay Krishna (The Ohio State University)
Scott Crooker (Los Alamos National Laboratory)
Takuya Maeda (Cornell University)
Valentin Jmerik (Ioffe Institute, St. Petersburg)
Vladimir Strocov (Paul Scherrer Institute)
Yasuyuki Miyamoto (Tokyo Institute of Technology)
Yifan (Frank) Zhang (Cornell University)
Yu Cao (Qorvo Inc.)
Zhiting Tian (Cornell University)

Contents

I Fundamentals




1 And Off We Go!  3
1.1 Beyond belief  3
1.2 A brief history of semiconductors  4
1.3 Future  6
1.4 These boots are made for walking  7
1.5 Chapter summary section  8
Further reading  8
Exercises  9


2 Secrets of the Classical Electron  11
2.1 Our ancestors knew metals  12
2.2 Discovery of the electron and its aftermath  12
2.3 Drude's model explains Ohm's law  12
2.4 Metals are shiny  15
2.5 Metals conduct heat  16
2.6 Icing on the cake: The Wiedemann–Franz law  17
2.7 All is not well  18
2.8 Chapter summary section  19
Further reading  20
Exercises  20


3 Quantum Mechanics in a Nutshell  23
3.1 Planck's photon energy quanta  23
3.2 Bohr's electron energy quanta  26
3.3 Wave-particle duality  28
3.4 The wavefunction  29
3.5 Operators  31
3.6 States of definite momentum and location  33
3.7 States of definite energy: The Schrödinger equation  35
3.8 Time-dependent Schrödinger equation  36
3.9 Stationary states and time evolution  37
3.10 Quantum current  39
3.11 Fermions and bosons  41
3.12 Fermion and boson statistics  44
3.13 The Spin-statistics theorem  46
3.14 The Dirac equation and the birth of particles  47
3.15 Chapter summary section  48
Further reading  49





4 Damned Lies and Statistics  59
4.1 Quantum statistics and entropy  59
4.2 The physics of equilibrium  62
4.3 Partition function for quantum systems  64
4.4 The Fermi–Dirac distribution  65
4.5 The Bose–Einstein distribution  66
4.6 Properties of the distribution functions  66
4.7 Quantum twist on thermodynamics  71
4.8 Meaning of equilibrium in semiconductor devices  72
4.9 Chapter summary section  76
Further reading  77
Exercises  77


5 Electrons in the Quantum World  83
5.1 In Schrödinger equation we trust  84
5.2 The free electron  85
5.3 Not so free: Particle on a ring  87
5.4 The electron steps into a higher dimension: 2D  98
5.5 Electrons in a 3D box  105
5.6 The particle in a box  111
5.7 The Dirac delta potential  113
5.8 The harmonic oscillator  114
5.9 The hydrogen atom  114
5.10 Chapter summary section  115
Further reading  116
Exercises  116


6 Red or Blue Pill: Befriending the Matrix  123
6.1 The expansion principle  124
6.2 Matrix mechanics  125
6.3 Matrices and algebraic functions  127
6.4 Properties of matrix eigenvalues  131
6.5 Looking ahead  131
6.6 Chapter summary section  132
Further reading  132
Exercises  133


7 Perturbations to the Electron's Freedom  135
7.1 Degenerate perturbation theory  136
7.2 Non-degenerate perturbation theory  138
7.3 The Brillouin–Wigner perturbation results  141
7.4 Rayleigh–Schrödinger perturbation results  142
7.5 The Hellmann–Feynman theorem  143
7.6 Perturbation theory example  144
7.7 Chapter summary section  147
Further reading  148
Exercises  149



II Bands, Doping, and Heterostructures



8 Electrons in a Crystal Get Their Bands, Gaps, and Masses  155
8.1 The free electron  156
8.2 Periodic perturbation  157
8.3 Bands, gaps, and effective masses  158
8.4 Non-degenerate perturbation theory  165
8.5 Glimpses of the Bloch theorem  166
8.6 Non-periodic potentials and scattering  167
8.7 Chapter summary section  169
Further reading  170
Exercises  170


9 Bloch Theorem, Bandstructure, and Quantum Currents  175
9.1 The Bloch theorem  176
9.2 Bloch theorem: aftermath  179
9.3 Real and reciprocal lattice, Brillouin zones  182
9.4 Velocity of Bloch states  187
9.5 Dynamics of Bloch states  190
9.6 Bloch wave velocity and ballistic current  194
9.7 Transport by Bloch waves with scattering  195
9.8 Energy (heat) current  197
9.9 Any current  198
9.10 Quantum Wiedemann–Franz law  199
9.11 Metals, semiconductors, semimetals and insulators  199
9.12 Chapter summary section  199
Further reading  201
Exercises  201

10 Crystal Clear: Bandstructure of the Empty Lattice  205
10.1 Diffraction as a sharp eye  205
10.2 Bragg diffraction condition  207
10.3 Broken symmetries and physical laws  208
10.4 Bravais lattices  209
10.5 Nearly free-electron bandstructure  211
10.6 Chapter summary section  215
Further reading  215
Exercises  215

11 Tight-Binding Bandstructure 11.1 Atoms, bonds, and molecules 11.2 Bandstructure of 1D, 2D, and 3D crystals 11.3 1D, 2D: nanotubes, graphene, BN, MX2 11.4 3D FCC: Si, GaAs 11.5 3D wurtzite: GaN, AlN, ZnO 11.6 Tight-binding to design new properties 11.7 Chapter summary section Further reading





12 k · p Bandstructure 12.1 k · p theory 12.2 Symmetry 12.3 Analytical model without spin 12.4 Non-parabolicity and sum rules 12.5 The Kane model with spin-orbit interaction 12.6 Chapter summary section Further reading Exercises

13 1, 2, 3 ... ∞: Pseudopotentials and Exact Bandstructure 13.1 The empire strikes back 13.2 Exact bandstructure of the Dirac comb potential 13.3 Tight-binding models emerge from Kronig–Penney 13.4 Point defects in Kronig–Penney models 13.5 Green’s functions from Kronig–Penney models 13.6 Pseudopotentials: what they are and why they work 13.7 Bandstructure of Si, Ge, and GaAs 13.8 Bandstructure of AlN, GaN, and InN 13.9 Pseudopotentials to DFT and beyond 13.10 Chapter summary section Further reading Exercises

14 Doping and Heterostructures: The Effective Mass Method 14.1 Effective mass approximation, envelope functions 14.2 3D, 2D, 1D, 0D: heterostructures 14.3 3D bulk bandstructure 14.4 Doped semiconductors 14.5 2D quantum wells 14.6 1D quantum wires 14.7 0D quantum dots 14.8 Finite barrier heights 14.9 Multilayers and superlattices 14.10 Wannier functions 14.11 Chapter summary section Further reading Exercises

15 Carrier Statistics and Energy Band Diagrams 15.1 Carrier statistics 15.2 EF is constant at thermal equilibrium 15.3 Metal-semiconductor Schottky junctions 15.4 p-n homojunctions 15.5 Heterojunctions 15.6 Energy band diagrams: Poisson+Schrödinger 15.7 Polarization-induced doping in heterostructures


15.8 Chapter summary section Further reading Exercises 16 Controlling Electron Traffic in the k-Space 16.1 Electron energies in semiconductors 16.2 Semiconductor statistics 16.3 Ballistic transport in semiconductors 16.4 Ballistic transport in non-uniform potentials/tunneling 16.5 Scattering of electrons by phonons, defects, and photons 16.6 The Boltzmann transport equation 16.7 Current flow with scattering: drift and diffusion 16.8 Explicit calculations of scattering rates and mobility 16.9 Semiconductor electron energies for photonics 16.10 The optical joint density of states ρ_J(ν) 16.11 Occupation of electron states for photonics 16.12 Absorption and emission: spontaneous and stimulated 16.13 Chapter summary section Further reading Exercises


Part III: Quantum Electronics with Semiconductors



17 Game of Modes: Quantized R, L, and C 17.1 Classical R, L, C circuits 17.2 Quantized conductance 17.3 Quantum capacitance 17.4 Kinetic inductance 17.5 Quantum R, L, C circuits 17.6 Negative R, L, and C 17.7 Chapter summary section Further reading Exercises


18 Junction Magic: Schottky, pn and Bipolar Transistors 18.1 Ballistic Schottky diodes 18.2 pn diodes: discovery 18.3 pn diodes: transport 18.4 Bipolar junction transistors 18.5 Deathniums! 18.6 Chapter summary section Further reading Exercises


19 Zeroes and Ones: The Ballistic Transistor 19.1 The MOS capacitor 19.2 The ballistic FET 19.3 Ballistic I-V characteristics



19.4 Quantum wire ballistic FET 19.5 The drift-diffusion FET 19.6 CMOS and HEMTs 19.7 Source/drain ohmic contacts 19.8 A brief history of FETs 19.9 Chapter summary section Further reading Exercises


20 Fermi’s Golden Rule 20.1 Fermi’s golden rule 20.2 Oscillating perturbations 20.3 Transitions to continuum 20.4 Kubo–Greenwood formula 20.5 Decoherence in qubits 20.6 Electron-electron scattering 20.7 Dyson series and diagrams 20.8 Zero-sum game: self energy 20.9 Chapter summary section Further reading Exercises


21 No Turning Back: The Boltzmann Transport Equation 21.1 Micro vs. macro 21.2 The Liouville theorem 21.3 Boltzmann transport equation 21.4 H-theorem and entropy 21.5 Equilibrium distribution 21.6 The RTA: time to relax! 21.7 One formula to rule them all 21.8 Electrical conductivity 21.9 Thermoelectric properties 21.10 Onsager relations 21.11 Conservation laws 21.12 Berry curvature correction 21.13 Limitations of the BTE 21.14 Chapter summary section Further reading Exercises

22 Taking the Heat: Phonons and Electron-Phonon Interactions 22.1 Phonon effects: a résumé 22.2 Phonon dispersions and DOS 22.3 Optical conductivity 22.4 Lyddane–Sachs–Teller equation 22.5 Acoustic wave devices 22.6 Thermal conductivity 22.7 Phonon number quantization


22.8 Electron-phonon interaction 22.9 Chapter summary section Further reading Exercises


23 Scattering, Mobility, and Velocity Saturation 23.1 Electron mobility: a résumé 23.2 Scattering mechanisms 23.3 Point defect scattering 23.4 Coulomb impurity scattering 23.5 Dipole scattering 23.6 Dislocation scattering 23.7 Alloy disorder scattering 23.8 Interface scattering 23.9 Phonon scattering 23.10 Experimental mobilities 23.11 High-field velocity saturation 23.12 Chapter summary section Further reading Exercises

24 Through the Barrier: Tunneling and Avalanches 24.1 Tunneling: a résumé 24.2 Single-barrier tunneling 24.3 WKB tunneling theory 24.4 WKB for semiconductors 24.5 Nordheim supply function 24.6 Fowler–Nordheim tunneling 24.7 Interband Zener tunneling 24.8 pn tunnel junctions in 1D, 2D, and 3D 24.9 NDR, backward diodes 24.10 Tunneling FETs 24.11 Resonant tunneling 24.12 Bardeen’s tunneling theory 24.13 Kubo formalism 24.14 Landau–Zener theory 24.15 Avalanche processes and impact ionization 24.16 Tail of the dragon 24.17 Chapter summary section Further reading Exercises

25 Running Circles: Quantum Magnetotransport 25.1 Magnetotransport: a résumé 25.2 Hall effect 25.3 Magnetoresistance 25.4 Nernst and Ettingshausen effects 25.5 Cyclotron resonance


25.6 Faraday rotation 25.7 Atoms: Bohr magneton and spin 25.8 Landau levels in solids 25.9 Shubnikov–de Haas effect 25.10 The quantum Hall effect 25.11 Quantum Hall effect theories 25.12 Hierarchy of Hall effects 25.13 Chapter summary section Further reading Exercises


Part IV: Quantum Photonics with Semiconductors



26 Let There Be Light: Maxwell Equations 26.1 Maxwell equations in vacuum 26.2 Light from Maxwell equations 26.3 Maxwell equations in (k, ω) space 26.4 Maxwell equations in materials 26.5 Classical light-matter interaction 26.6 Kramers–Kronig relations 26.7 Accelerating charges radiate 26.8 Need for quantum theory of light 26.9 Chapter summary section Further reading Exercises

27 Light–Matter Interaction 27.1 Photonic effects: a résumé 27.2 Electron-photon matrix elements 27.3 Absorption spectra of semiconductors 27.4 Number of photons in light 27.5 Photon absorption rate 27.6 Equilibrium absorption coefficient 27.7 Quantum wells, wires, and dots 27.8 Critical points 27.9 Forbidden and indirect absorption 27.10 Exciton absorption 27.11 Franz–Keldysh effect 27.12 Intersubband absorption 27.13 Free carrier and impurity absorption 27.14 Photoelectron spectroscopy 27.15 Chapter summary section Further reading Exercises

28 Heavenly Light: Solar Cells and Photodetectors 28.1 Solar cells and photodetectors: a résumé 28.2 Solar cells


28.3 Shockley–Ramo theorem 28.4 Photodetectors 28.5 Avalanche photodiodes 28.6 Quantum well infrared photodetectors 28.7 Electro-absorption modulators 28.8 Solar blind photodetectors 28.9 Chapter summary section Further reading Exercises


29 Reach for the Stars: Lasers and LEDs 29.1 Lasers and LEDs: a résumé 29.2 Einstein’s A and B coefficients 29.3 Semiconductor emission 29.4 Entropy of optical transitions 29.5 Gain and emission in bands 29.6 Spontaneous emission: LEDs 29.7 Stimulated emission: lasers 29.8 Double heterostructure lasers 29.9 Laser rate equations 29.10 Case study: blue laser diode 29.11 DFB, VCSELs and QCLs 29.12 Towards field quantization 29.13 Broadband field modes 29.14 Quantization of fields 29.15 Field quantization: aftermath 29.16 Quantized light-matter interaction 29.17 Fundamental optical processes 29.18 Einstein’s re-derivation of Planck’s law: back to 12! 29.19 Chapter summary section Further reading Exercises

30 Every End is a New Beginning 30.1 Smallest, fastest, brightest 30.2 Looking back and forward 30.3 Ferroelectric semiconductors 30.4 Ferromagnetic semiconductors 30.5 Multiferroic semiconductors 30.6 Superconducting semiconductors 30.7 Semiconductors for quantum communications 30.8 Semiconductors for quantum computation 30.9 Semiconductors for energy 30.10 Semiconductors for healthcare and agriculture 30.11 Semiconductors for space exploration 30.12 Social impact of semiconductors 30.13 Chapter summary section Further reading




A Appendix A.1 What is in the appendix? A.2 Semiconductor Formulae A.3 Physical properties of 3D and 2D semiconductors A.4 References for the appendix A.5 Physical constants






Part I: Fundamentals


1 And off We Go!

In this chapter, we get acquainted with the reason for, the structure of, and the best way to use this book. Each chapter will begin with a boxed summary of this form:

• Why this book?
• What is in this book?
• How can you make the best use of this book?

The summary aims to start out with a well-defined set of questions that are answered in the chapter.

1.1 Beyond belief

How did the smallest and lightest fundamental particle in nature with mass – the electron – enable me to type these words on a computer? How did the electron help light up the screen of my laptop? How did it help remember all of this text as the battery of my laptop ran out? And how did it help kick those photons that power emails, download this file, and read these words? Explaining this remarkable story is the lion’s share of what we will do in this book.

The invention of the steam engine made a mockery of what was believed to be the limit of the speed at which heavy mechanical objects could be moved. In much the same way, the discovery of the electron, and in particular the discovery of the class of materials called semiconductors, has made a mockery of what was believed possible for three of the deepest and most profound human endeavors:

• computation: performing logic to produce information,
• memory: storing information, and

• communication: transmitting and receiving information.

Our understanding of the inner workings of electrons in semiconductors has given us powers beyond belief. We depend on semiconductors today to see and talk with family and friends on the other side of the planet. Semiconductors empower us to generate energy from sunlight, predict the weather, diagnose and treat diseases, decode DNA, design and discover new drugs, and guide our cars in the streets as deftly as satellites in deep space. They have placed the entire

recorded history of information about the universe in the palm of our hands. In this book, we will dig in to understand how all of this is possible.

The semiconductor revolution is one of the most remarkable human adventures in the entire recorded history of science. The revolution has been powered by the combined efforts of scientists and engineers from several fields: the contributions of mathematicians and physicists, of chemists and materials scientists, and of electrical engineers have made this possible. We all acknowledge that the artificial labeling of our skills serves an administrative purpose¹, and has no basis in science! Nowhere is this highlighted more than in the field of semiconductors, in which ”engineers” have been awarded Nobel Prizes in Physics, and physicists and chemists have founded successful ”engineering” semiconductor companies such as Intel on their way to becoming billionaires.

¹ Harvard’s ”Division of Engineering and Applied Sciences” or DEAS changed its name to SEAS, the ”School of Engineering and Applied Sciences”. No one wants to create a division between applied sciences and engineering indeed!

1.2 A brief history of semiconductors

Fig. 1.1 A brief history of electronic materials.

Fig. 1.2 J. J. Thomson discovered the electron in 1897 at the Cavendish Laboratory (see Fig. 25.3 too). Awarded the 1906 Nobel Prize in Physics. Seven of his students went on to win Nobel Prizes.

Fig. 1.1 shows a broad timeline of topics on the physics and applications of semiconductors that are relevant in this book. Insulators and metals were known centuries ago. Their properties were studied, and theories were developed to explain them. In Chapter 2, we will intercept this emerging story during the period right after the discovery of the electron in 1897 by J. J. Thomson (Fig. 1.2). The concept of the electron, subject to the classical laws of mechanics, electromagnetism, and thermodynamics, was used by Paul Drude to explain many properties of metals that had remained mysterious until that point. It comes as a surprise to many today that superconductivity was experimentally discovered in 1911, decades before semiconductors were recognized as a distinct electronic state of matter. But our understanding of the physics of semiconductors developed much more rapidly than that of superconductors, as did their applications – as highlighted in Fig. 1.1. This book will make clear why this is so: electrons in semiconductors are at first glance ”simpler” to understand because they act independently as single particles, whereas electrons in superconductors pair up and are strongly correlated. However, the simplicity of electrons in semiconductors was just a mask. Applications following the invention of the semiconductor transistor in 1947 led to a very high level of control, perfection, and understanding of semiconductors, which has led to the discovery of several layers of richer physical behavior. Esaki discovered electron tunneling in a solid – a genuine quantum mechanical effect highlighting the wave nature of electrons – in semiconductor p-n diodes in 1957, around the same time Bardeen, Cooper, and Schrieffer proposed the eponymous BCS theory of superconductivity. The semiconductor transistor miniaturized the vacuum tube amplifier and made it far more compact, rugged, and energy efficient,


setting up the stage for its use in digital logic and memory, enabling computation. In much the same way, by understanding how electrons in semiconductors interact with photons, the invention of the semiconductor laser diode in the 1960s and 1970s shrunk the solid-state laser and revolutionized photonics, enabling semiconductor lighting and optical communications. In 1980, von Klitzing discovered the integer quantum Hall effect while investigating electron transport in the silicon transistor at low temperatures and high magnetic fields. This was followed by the discovery of the fractional quantum Hall effect in 1982, in high-quality quantum semiconductor heterostructures grown by molecular beam epitaxy (MBE), by Tsui, Störmer, and Gossard. The quantum Hall effects revealed a whole new world of condensed matter physics, because under specific constraints of lower dimensionality and the right electric and magnetic fields, the behavior of electrons defied classification into any of the previously identified phases: metals, insulators, semiconductors, or superconductors. The effort to classify the quantum Hall state has led to the unveiling of new electronic and magnetic phases of electrons in solids, based on the topology of their allowed energy bands. This line of thought has led to the discovery of topological insulators, the newest addition that significantly enriches the field of electronic phases of matter. The mathematical language of topology was developed in the early part of the 20th century. Slowly but surely, the deep connection of several physical phenomena such as electronic polarization, magnetization, and spin-Hall effects to the topological and geometric aspects of their quantum states is being discovered. The discovery of high-temperature layered cuprate superconductors in 1986 by Bednorz and Müller again rocked the field of superconductivity, and the field of condensed matter physics at large. That is because the tried-and-tested BCS theory of superconductivity is unable to explain their behavior. At the forefront of experiments and theory today, a delightful connection is being established between topological insulators (which have emerged from semiconductors) and superconductivity (which has emerged from metals). There are now proposals for topological superconductivity, and for semiconductor–superconductor heterostructures that can make electrons behave in much stranger ways – in which they pair up to lose their charge and spin identities and can ”shield” themselves from electric and magnetic fields. Such particle-antiparticle pairs, called Majorana fermions, can enable robust quantum bits (qubits) for quantum computation in the future. If all this sounds like science fiction², let me assure you it is not – the race is on to experimentally find these elusive avatars of the electron in many laboratories around the world!


² The 2009 movie Avatar showed humans trying to mine the fictional material ”Unobtanium” from Pandora – a room-temperature superconductor.


Fig. 1.3 How semiconductor technologies have changed the handling of data and information over the last two decades.

1.3 Future

Semiconductors today power the ”information age”, in which data (or information) is produced in high volumes and transported at the speed of light. Fig. 1.3 shows the underlying structure of information systems. In the 1990s, most computers were stand-alone devices that we interacted with and programmed directly. They boasted semiconductor integrated-circuit microprocessors for computation and logic, semiconductor random-access memories (RAMs), and magnetic (or spin-based) hard drives or read-only memories (ROMs) as two types of memory. The communication of data was achieved through wires directly connected to the computers. Today, the wires are slowly vanishing as they are replaced by photons – both of the microwave or RF kind that propagate through the atmosphere, and the infrared kind that propagate in optical fibers, shuttling massive amounts of data at the speed of light. A large and increasing part of the computing infrastructure resides in the ”cloud”. The cloud is loosely made of a large and distributed network of semiconductor microprocessors and memories that may be housed in server farms. Much of the heavy computational work and memory storage is thus not performed on the computers in front of us with which we directly interact. The communication links that connect our interface devices (or so-called ”edge devices”) with the cloud are therefore increasingly important, and their bandwidth will determine the efficiency of such networked information systems. Just as a machine as complicated as a car needs to be driven, the information system needs instructions in the form of computer and system ”programs”, or ”code”. These instruction sets, no different in principle from sets of instructions to drive a car, set the whole system in motion and control its behavior, just as a driver weaving through traffic.

The computational and memory capacity created by miniaturization of transistors and memory elements, and the large-bandwidth communication links powered by semiconductor lasers, have also led to unexpected outcomes, such as the explosion of social media. The first semiconductor ”point-contact” transistor of 1947 was a few centimeters in size. Fig. 1.4 shows this device in comparison to a much newer transistor – the silicon FinFET of 2012. The transistor is so small today that you can count the number of atoms across the fin – the scale has shrunk by a factor of 10⁷, from centimeters in 1947 to nanometers in 2012. In the latest generation of microprocessors, several billion transistors fit into a cm-size chip. The orders-of-magnitude increase in computational power and storage capability is a direct result of this miniaturization. Such orders-of-magnitude improvements over sustained periods of time are extremely rare in the recorded history of science.

Where are things going next? Some semiconductor experts and accomplished practitioners of this art may tell you that the golden era of semiconductors is over. That transistors have reached such small dimensions and semiconductor lasers are so efficient that there is not much more one can do. Do not buy into such predictions, because most such predictions can be proven to be wrong³. Come back and read this chapter once more. Just as it has happened again and again, some of you will do experiments or come up with a theory that will make a mockery of all current belief of what is possible with semiconductors and related materials. The future of our civilization depends on this adventure that you must embark upon! This book is an effort to motivate you to take this journey by arming you with the science that will serve as your fuel.
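The seven-orders-of-magnitude shrink quoted above, and the transistor density it implies, follow from simple arithmetic. A minimal sketch, using only the numbers quoted in this section (the ”several billion” count is an illustrative assumption, not a measured figure):

```python
import math

# Scale of transistor miniaturization, using the numbers quoted in this section.
cm = 1e-2   # first point-contact transistor (1947): centimeter scale, in meters
nm = 1e-9   # FinFET fin (2012): nanometer scale, in meters

shrink = cm / nm
print(f"linear shrink factor: {shrink:.0e}")  # 1e+07: the factor of 10^7 in the text

# "Several billion transistors on a cm-size chip" implies a center-to-center
# pitch of roughly:
transistors = 5e9          # illustrative assumption for "several billion"
chip_area = 1e-2 * 1e-2    # a 1 cm x 1 cm chip, in m^2
pitch = math.sqrt(chip_area / transistors)
print(f"implied transistor pitch: {pitch * 1e9:.0f} nm")  # ~140 nm center-to-center
```

The implied pitch of order a hundred nanometers is consistent with the nanometer-scale devices shown in Fig. 1.4.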


³ “It is exceedingly difficult to make predictions, especially about the future” – Danish proverb, sometimes attributed to Niels Bohr invoking the uncertainty principle in his answer to the question – “What will be the impact of quantum mechanics on the future of the world?”

1.4 These boots are made for walking

This book is meant to be read with enjoyment and wonder, its concepts discussed in and outside of class, in office hours, and over coffee. Every effort will be made to present the development of the field as a uniquely human endeavor – laws and theorems did not fall out of the sky, but are creations of precise experimental observation and the recognition of mathematical patterns, both achieved by human ingenuity. Please email me ([emailprotected]) your feedback and suggestions, as they go a long way in making sure that the lofty goals of this book are indeed achieved. The book is divided into four modules, or parts:

• Part I: Fundamentals: In Chapters 1 through 7, we develop the basic concepts of quantum mechanics and quantum statistics, with notions of perturbation theory to apply to semiconductor physics.
• Part II: Bands, Doping, and Heterostructures: In Chapters 8–15, the core concepts of electronic states in semiconductors, heterostructures, and nanostructures are developed, culminating in

Fig. 1.4 The first centimeter-scale point-contact transistor, and today’s nanometer-scale FinFETs.


Chapter 16, which summarizes and connects the first half of the book to the next.
• Part III: Quantum Electronics: In Chapters 17–25, the quantum concepts are applied to semiconductors and nanostructures to explain several electronic phenomena, with applications to diodes, memory, and transistors.
• Part IV: Quantum Photonics: In Chapters 26–29, light-matter interaction is discussed, along with the quantum processes responsible for solar cells, light-emitting diodes (LEDs), and lasers; Chapter 30 discusses a possible future of the field.

1.5 Chapter summary section

At the end of each chapter, there will be a boxed summary emphasizing the major concepts that we have learned:

• This book is meant to explain the physics of semiconductors, and to connect the physics to applications such as diodes, transistors, and lasers.
• The book is divided into four modules: (I) Fundamentals, (II) Bands, Doping, and Heterostructures, (III) Quantum Electronics with Semiconductors, and (IV) Quantum Photonics with Semiconductors.
• We discussed how to approach the contents of the book. For effective learning, there is no alternative to working through the concepts in the chapters and solving the Exercises at the end of each chapter. I have purposely shared historical contexts and anecdotes in the text with the hope that familiarity with the stories and personalities behind the science and technology will lead to a deeper understanding and appreciation of the subject.

Further reading

At the end of each chapter, there is a short section on recommended readings that amplify the discussions in the chapter. Thirty Years that Shook Physics (George Gamow) gives a delightful and readable account of how quantum mechanics was developed, complete with a stage play at the end. For a deep insight into the development of semiconductor physics starting from quantum mechanics as the technologically successful arm of condensed matter physics, and for the persons and organizations

instrumental in its development, Crystal Fire (Riordan and Hoddeson) and True Genius (Daitch and Hoddeson) are recommended. The Story of Semiconductors (Orton) traces the development of semiconductors and their science and technology up to the present time. The Physics of Semiconductors (Grundmann) has an excellent timetable of semiconductors, and is broadly an excellent companion for the topics covered in this book. For those who want to understand how deep science and engineering, entrepreneurship, inspirational management,

and a bit of luck go hand in hand to commercialize technological breakthroughs in semiconductors, the following two books are highly recommended: The Idea Factory (Gertner), and Fire in the Belly: Building a

World-leading High-tech Company from Scratch in Tumultuous Times (Neal and Bledsoe). For details of each chapter-end reference, please refer to the end of the book.

Exercises

(1.1) Cat’s whiskers
Given the preponderance of cat videos on the internet today, it is ironic that the same semiconductors that power today’s web got their earliest break in industry in a device that used the properties of a cat’s whisker (Fig. 1.5). Research the web and write a short summary of how the earliest semiconductor diode rectifiers changed the radio, and the specific need for the physics of feline whiskers.

Fig. 1.5 Cat’s whisker, the shape that launched the first semiconductor devices used for crystal-radios.

(1.2) Moore’s law
In 1965, Gordon Moore, working at Fairchild Semiconductor at the time (and to-be co-founder of Intel), made a prediction on the scaling of semiconductor transistor sizes and their packing density on chips with time. Though Moore’s law is not a law of mathematics or physics such as the law of gravitation, it does capture an economic and computational argument that has remained approximately true for half a century. Perform some research and write a short summary of Moore’s law, its implications, and its inevitable and ultimate demise.

(1.3) Mining semiconductor materials
The most heavily used semiconductors today for high-performance electronic and photonic devices are silicon (Si), silicon carbide (SiC), germanium (Ge), gallium arsenide (GaAs), and gallium nitride (GaN). Write a short critique discussing their abundance on earth and in the universe, and how some needs drive the search for alternative options.

(1.4) Take two: John Bardeen

Fig. 1.6 John Bardeen, a founding father of both semiconductor physics and the physics of superconductors. He is the only person in history to have been awarded two Nobel Prizes in Physics: in 1956 for the invention of the semiconductor transistor, and in 1972 for the theory of superconductivity.

John Bardeen made pioneering contributions to the physics of semiconductors. He was a co-discoverer of the transistor with Brattain and Shockley. Perform research on his life and contributions to semiconductor physics, and write a paragraph summarizing them.

(1.5) Let there be light
In 1907, H. J. Round reported the observation of the emission of light from a metal/semiconductor rectifier. Read the (entire) report in Fig. 1.7, and research how the same principle of light emission from SiC remained the principal means of obtaining blue light at relatively low efficiencies from semiconductors, until breakthroughs in the 1990s in the gallium nitride semiconductor family revolutionized solid-state lighting.

Fig. 1.7 Round’s 1907 report on his surprising observation of the emission of light on applying voltage across a crystal of carborundum, which we now know is the compound semiconductor silicon carbide (SiC). This was the first semiconductor light-emitting diode, or LED. LEDs made with the semiconductor gallium nitride (GaN) have revolutionized the lighting industry in the last decade.

(1.6) Units and symbols: safety protocols!
The limited number of letters in the English alphabet makes it inevitable that some notations end up being used to represent more than one quantity. Also, different fields use different units of measure, and for good reason. Imagine a person working on semiconductors having to measure transistor gate lengths in light years, or a cosmologist having to state the distance between galaxies in nanometers! This problem alerts you to such instances in this book, to avoid the pain or confusion they may cause when you encounter them.

(a) Symbols: The most painful choice I had to make in writing this book is for the symbol of the magnitude of the electron charge: e vs. q. The sign is another matter, which we discuss after the symbol. I chose q because of the use of e for Euler’s number e = ∑_{n=0}^{∞} 1/n! = 2.718... in the free electron wavefunction ∼ e^{ik·r}. The time of reckoning for this choice comes in later chapters (such as Chapter 23 on scattering and mobility), when the symbol q = k_i − k_f is the difference of electron wavevectors before and after scattering. We take refuge by switching back to e for a few sections there. I hope this does not do too much violence. The reader can use her or his judgement with a lightning-fast units check in case of confusion. Planck’s constant is h, and ”h-bar” is ħ = h/2π. In photonics, frequency is represented by ν (in hertz), and ”circular frequency” by ω = 2πν (in radians/s). This means hν = ħω can be used interchangeably for energy. In electronics, f is preferred for frequency in hertz, and transistor speeds are stated as f_T and f_max.

(b) Signs: We owe this one to Benjamin Franklin for choosing the charge of the electron to be negative. In this book the charge of the electron is −q. Generally this should not be a problem, except when the minus sign appears in front of every expression, making the book look like a top-secret military cryptogram. For almost every case, the signs in the text respect Franklin’s choice. But there are at least one or two instances where I have committed the necessary sin of leaving the negative sign out when it does not hurt.

(c) Language: In this book, bandstructure and band diagram mean different things. The bandstructure is the E(k) energy eigenvalue spectrum of a semiconductor. The energy band diagram is the spatial variation of the conduction band minimum Ec(x) and valence band maximum Ev(x). Under no circumstance should they be confused with one another.

(d) Units: We use SI units throughout. But in semiconductors, length is expressed in cm, microns, or nanometers depending on the natural scale of the physical quantity under consideration. For example, electron or hole densities are expressed in cm⁻³, photon wavelengths in microns, and gate lengths of small transistors in nanometers.

(1.7) Semiconductor cheat-sheets: where to find them
(a) Tables A.1, A.2, and A.3 in the Appendix at the end of this book contain the physical properties of several semiconductors of the 3D and 2D form. Study these tables and use them.
(b) Table A.4 in the Appendix has the numerical values of physical constants that will be needed throughout the book.
(c) Fig. A.1 in the Appendix at the end of this book lists the constitutive equations that govern the quantum physics of semiconductors and their electronic and photonic device applications.
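The unit conventions of Exercise 1.6 are easy to check numerically. A minimal sketch, using the exact SI values of the constants, verifies that hν and ħω give the same photon energy and recovers the familiar rule of thumb E[eV] ≈ 1.24/λ[µm] (the 1 µm wavelength is chosen only as a representative optical-fiber example):

```python
import math

h = 6.62607015e-34    # Planck's constant, J*s (exact SI value)
hbar = h / (2 * math.pi)   # "h-bar"
q = 1.602176634e-19   # magnitude of the electron charge, C (the book's q)
c = 2.99792458e8      # speed of light in vacuum, m/s

lam = 1e-6                 # a 1 micron (infrared) photon wavelength, in m
nu = c / lam               # frequency nu, in hertz
omega = 2 * math.pi * nu   # circular frequency omega, in radians/s

E_photonics = h * nu       # the h*nu convention
E_circular = hbar * omega  # the hbar*omega convention
assert math.isclose(E_photonics, E_circular)  # h*nu = hbar*omega: same energy

print(f"E = {E_photonics / q:.3f} eV")  # ≈ 1.240 eV for a 1 micron photon
```

Dividing by q converts joules to electron volts, which is why the electron charge appears in a purely photonic calculation.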

2 Secrets of the Classical Electron

I would like to emphasize from the beginning how experiments have driven the search for new theories. New theories are born when experimental facts defy the sum total of all existing theoretical models. This happens again and again, because when you perform an experiment, you cannot stop a physical law from doing its thing. Meaning, any experiment ever performed has measured every known law of physics, and every yet to be discovered law of physics! So when you measure something in regimes no one has ever reached before, because of better equipment, or because you are just plain clever, new physical phenomena reveal themselves to you if you are a careful researcher. We will repeatedly observe these sorts of dynamics at play. Because understanding comes from a quantitative theory that can explain experiments, and a quantitative theory is written in the language of mathematics, at times experimental observations languish till the development of the required new mathematics. For example, Newton had to develop the mathematical language of calculus to quantitatively explain the forces that drive the experimentally measured paths of planets. In a similar vein, for a quantum mechanical understanding of semiconductors and nanostructures, some notions of matrix-based mathematics and Hilbert spaces will be needed in the book. But that can wait, as we can say so much by resting on the shoulders of Newton, Boltzmann, and Maxwell. In this chapter we glimpse how the observed electronic, thermal, and photonic properties of metals were explained in the pre-quantum era using the three pillars (Fig. 2.1) on which physics rested around 1900: Newton's classical mechanics, Boltzmann's thermodynamics, and Maxwell's electromagnetism. But then clever experimentalists pushed measurements to new regimes, revealing physical phenomena that were in stark contradiction to these theories. Let's start talking about metals with the following questions:

• Why do metals conduct electricity and heat so well?
• Why are metals shiny?
• Why is the ratio of thermal and electrical conductivities a constant for most metals?

Fig. 2.1 Three pillars of classical physics.

Chapter contents: 2.1 Our ancestors knew metals · 2.2 Discovery of the electron and its aftermath · 2.3 Drude's model explains Ohm's law · 2.4 Metals are shiny · 2.5 Metals conduct heat · 2.6 Icing on the cake: The Wiedemann–Franz law · 2.7 All is not well · 2.8 Chapter summary section · Further reading


2.1 Our ancestors knew metals

Because nearly two-thirds of the elements in the periodic table are metals, and many occur in nature, they were discovered early, well before the 1900s. Metals have been known for a very long time1 to be very different from insulators by being:

• good conductors of electricity,
• good conductors of heat, and
• reflective and shiny.

1 All these properties were known before the discovery of the electron, or the atom.

Wiedemann and Franz in the 1850s experimentally discovered a deep and mysterious connection between the thermal conductivity κ and the electrical conductivity σ of metals. They found that:

• The ratio κ/(σT) ∼ 10⁻⁸ (V/K)², where T is the temperature, is a constant2 for many different metals.

2 This experimental fact is known as the empirical Wiedemann–Franz law.

But why? Though there were several attempts to explain all these physical characteristics of metals, a truly satisfying explanation had to wait for the discovery of the single entity responsible for every one of the properties of metals discussed above: the electron3.

3 The word "electron" represents an indivisible portion of "electricity" – the word electricity predates the word electron by several centuries. It motivated a number of "..on" names: photon, proton, neutron, phonon, fermion, boson...

4 Protons help maintain charge neutrality of the atom, and neutrons stabilize the nucleus and keep the atom from disintegrating. Protons and neutrons are not elementary particles like the electron; they are composed of quarks.

5 Hence the name "electron gas".

Fig. 2.2 Paul Drude in 1900 proposed a model that combined classical mechanics, electromagnetism, and thermodynamics to explain the many properties of metals by invoking the then newly discovered electron.

2.2 Discovery of the electron and its aftermath

J. J. Thomson (Fig. 1.2) discovered the electron in 1897 in the Cavendish laboratory. He found the existence of a particle of mass me = 9.1 × 10⁻³¹ kg and electrical charge −q, where q = 1.6 × 10⁻¹⁹ coulomb is the magnitude of the electron charge. This was followed by the discovery of the nucleus4 by Ernest Rutherford in 1911, and the neutron by James Chadwick much later in 1932. In this chapter, the nucleus of the atom will play the cameo role of a passive obstruction in the path of freely moving electrons in a metal. It will increasingly assert its role in later chapters. Today, the electron is known to be an elementary point-particle (of zero radius), which in addition to its mass and charge also has an intrinsic angular momentum called the spin. Exactly how a point particle of no radius can have these properties is still under investigation; we will encounter their consequences in Chapter 3. In this chapter, we assess the consequences of its classical charge and mass.

2.3 Drude's model explains Ohm's law

The discovery of the electron was the trigger that precipitated an explanation of most properties of the metal discussed in Section 2.1. This was achieved by Paul Drude (Fig. 2.2), who applied the notion that electrons move in a metal just like molecules move in a gas5, following the laws of classical mechanics, thermodynamics, and electromagnetism.

Fig. 2.3 (a) Electronic resistivity ρ = 1/(qnµ), and (b) conductivity σ = 1/ρ = qnµ of various metals at T = 273 K, and (c) a comparison of the resistivities of metals to semiconductors as a function of mobile electron concentration n. The lines are of constant mobility µ in units of cm²/(V·s). Semiconductors typically have ∼2–10 orders of magnitude lower mobile carrier concentrations than metals. Unlike metals, in which the mobile carrier concentration n is fixed, in a semiconductor n can be varied by various methods such as doping, field-effect, heat, or light. The variation in n shown here for Silicon is obtained by varying the concentration of Phosphorus doping. The data for metals are adapted from Solid State Physics by Ashcroft and Mermin.

The first task was to explain the electrical conductivity σ of metals. Fig. 2.3 shows the measured resistivities and mobile carrier concentrations of several metals. The resistivities ρ = 1/σ of most metals vary between 1 < ρ < 100 µΩ·cm. The experimental fact measured for metals is that the electric charge current I flowing through them is linearly proportional to the voltage V across them. This is stated as Ohm's law V = IR, where R is the resistance of the metal, as shown in Fig. 2.4. Current is measured in amperes (or coulombs/second), voltage in volts, and resistance in ohms = volts/amperes. Drude imagined that the metal is filled with electrons of volume density n = N/V in cm⁻³ units, where N is the total number of electrons in the metal, and V = AL is the volume as indicated in Fig. 2.5. The current density J measured in amp/cm² is defined by I = J·A. The expression for the

Fig. 2.4 Ohm’s law is V = IR, or equivalently J = σE.


6 Why does every electron in the universe have exactly the same charge? Paul Dirac, whom we will encounter in the next chapter, offered a delightful explanation for this observation based on quantum mechanics and topology, but it requires the existence of a magnetic monopole, which is yet to be observed!

7 In this book, we will exclusively use SI units. In semiconductor physics, the length is often in cm instead of m. For example, particle densities are in /cm³, and electric fields in V/cm.

Fig. 2.5 Electron gas moving in response to an electric field in a metal.

current density is J = qnv.

q →: The charge current density is given by the particle flux density nv times the charge of the particle, q. The flux density of particles is the volume density times the velocity, n × v. Each electron in this flux6 drags along with it a charge of q, so the current density is J = qnv. Later, when we encounter heat, spin, or other currents, we will simply multiply the particle flux density by the corresponding quantity that is dragged along by the particle. In this book, we use the notation q = +1.6 × 10⁻¹⁹ coulomb. So the charge on the electron is −q, and on a proton is +q.

n →: Because the structure of the atom was not known at the time, the electron density n is an empirical number, of the order of 10²³/cm³ in Drude's model7. The density can, however, be experimentally measured by the Hall effect, which gives both the density n and the sign of the particles carrying the current. We will see later that metals have far more electrons than ∼10²³/cm³ in them, with the number increasing as we go downwards in the periodic table to heavier elements. Most electrons, however, are stuck in core states or filled bands – concepts we develop in subsequent chapters, as they require quantum mechanics to explain. Some electrons are free to wander around the crystal and conduct charge current – those are the conduction electrons that Drude's quantity n counts; the others simply do not play a role.

v →: Drude found the average velocity of the electrons in the metal with the following argument. The force on the electrons is the electromagnetic, or Lorentz, force F = (−q)(E + v × B), where E is the electric field, and B is the magnetic field. If B = 0, let's assume the scalar form for simplicity, where |E| = E = −∇V = −dV/dx = V/L is the electric field exerted on the electrons by the battery. Because of this force, electrons accelerate according to Newton's law F = dp/dt, where p = me v is the electron momentum.
But as they speed up, sooner or later they will bump into the atoms in the metal, just as a swarm of bees drifting blindly through a forest. Whenever such a collision occurs, the momentum of the electron is randomized, and it starts accelerating along the direction of the force again. The modified Newton’s law accounting for the dissipation or damping of momentum every τ seconds on an average is:

(−q)E = me (dv/dt) − me v/τ,    (2.1)

8 The electrons impart momentum to the atoms upon collision. At high current densities this momentum transfer can physically move the atoms. This phenomenon is called electromigration, and is a real problem in microprocessors.


where the increase in momentum due to the force is tempered by the decrease upon collisions every τ seconds8. If we ask what happens in the "steady state", meaning we have applied the voltage and waited long enough that all transients have died out, we can use d(...)/dt → 0, which yields the velocity v as

v = (qτ/me) E = µE  ⟹  µ = qτ/me.    (2.2)



The electrons achieving an ensemble steady-state velocity in the presence of a constant force is similar to a parachutist reaching a terminal velocity in spite of the constant gravitational force9. The drift velocity is proportional to the electric field, and the constant of proportionality µ = qτ/me is defined as the mobility of electrons10. The concept of mobility will prove to be more useful for semiconductors than metals, as we will see later. It is clear that if the electron scatters less often, τ ↑ ⟹ µ ↑. Now putting together all the above pieces, Drude found that the current density is

J = qnv = (nq²τ/me) E = σE  ⟹  σ = nq²τ/me,    (2.3)


9 A more apt analogy is the average velocity of a car in stop-and-go traffic, since electrons accelerate every time between collisions.

10 Since the mobility is the ratio of the velocity to the electric field, µ = v/E, its unit as used in this book is (cm/s)/(V/cm) = cm²/(V·s).


where it is seen that the current density is proportional to the electric field, with the proportionality constant σ = nq²τ/me called the electrical conductivity. The components appeal to our intuition – more electrons and longer scattering times should lead to higher conductivity. The electrical conductivity does not depend on the sign of the charge q. To close the story, we revert back to the current and see how Drude's model explained Ohm's law of the electrical conductivity of metals:

I = JA = σEA = σ(V/L)A  ⟹  V = I · (1/σ)(L/A) = IR  ⟹  R = (1/σ)(L/A) = ρ(L/A).    (2.4)

The resistance R is measured in ohms. It is related to the microscopic details provided by Drude via the conductivity σ, or the resistivity ρ = 1/σ, and to the macroscopic dimensions11 via L and A. A longer metal has more resistance because it takes an electron longer to get through due to more scattering obstructions. This is the essence of classical mechanics applied to the electron.
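To make the result concrete, the conductivity expression σ = nq²τ/me can be inverted: from a measured resistivity and an electron density one can back out the scattering time and the mobility. Here is a minimal sketch, with assumed copper-like inputs (n ≈ 8.5 × 10²²/cm³, ρ ≈ 1.7 µΩ·cm are typical textbook values, not data from this chapter's figures):

```python
# Inverting sigma = n q^2 tau / me (Eq. 2.3) for a copper-like metal.
# Assumed illustrative inputs: n ~ 8.5e22 /cm^3, rho ~ 1.7 microohm-cm.
q  = 1.602e-19    # C, magnitude of electron charge
me = 9.109e-31    # kg, electron mass

n_m3     = 8.5e22 * 1e6        # electron density: /cm^3 -> /m^3
rho_SI   = 1.7e-6 * 1e-2       # resistivity: ohm-cm -> ohm-m
sigma_SI = 1.0 / rho_SI        # conductivity, 1/(ohm m)

tau   = sigma_SI * me / (n_m3 * q**2)   # scattering time, s
mu_cm = (q * tau / me) * 1e4            # mobility, cm^2/(V s)

J_cm2   = 100.0                         # assumed current density, A/cm^2
v_drift = (J_cm2 * 1e4) / (q * n_m3)    # drift velocity, m/s, from J = q n v

print(f"tau ~ {tau:.1e} s")             # tens of femtoseconds
print(f"mu  ~ {mu_cm:.0f} cm^2/(V s)")
print(f"v_drift ~ {v_drift*1e3:.2f} mm/s at 100 A/cm^2")
```

The femtosecond-scale τ and the sub-mm/s drift velocity are worth remembering: the current in a wire is a huge number of electrons crawling, not a few electrons sprinting.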

11 This concept works well for large resistors. But we will see later that this picture fails spectacularly when L and A become very small, comparable to the wavelength of electrons. These anomalies were observed in the 1960s, leading to the observation of quantized conductance.

2.4 Metals are shiny

Why do metals reflect light? The secret lies in the swarm of conduction electrons in them! Maxwell (Fig. 2.6) had shown that a beam of light is a wave of oscillating electric and magnetic fields. The time-dependent electric field of a light beam of amplitude E0 is E(t) = E0 e^{iωt}, where the circular frequency ω = ck = 2πc/λ = 2πf is linked to the speed of light c and its color, or in other words its wavelength λ. If we subject the swarm of electrons in the metal not to the constant DC voltage of a battery as we did in Section 2.3, but to this new pulsating field, we can explain why metals reflect light. Instead of me solving this problem, can you do it for yourself? Sure you can. I have outlined the method in Exercise 2.5 at the end of this chapter. In a few steps you can explain the luster of metals, and also the colors of some!

Fig. 2.6 James Clerk Maxwell in 1865 unified electricity and magnetism. By introducing the concept of the displacement current, he showed that light is an electromagnetic wave. He made significant contributions to several other fields of physics and mathematics.


2.5 Metals conduct heat

12 Diamond, BN, SiC, and AlN are exceptions, to be discussed later.

Fig. 2.7 Figure showing how a uniform density of electrons n in a metal can transport heat energy from the hot to the cold side.

Drude's model of electrical conductivity leads naturally to an explanation of the thermal conductivity of metals. As a rule of thumb for metals, good electrical conductors are also good thermal conductors. For example, copper has one of the highest electrical conductivities at σ = 1/ρ ≈ 1 (µΩ·cm)⁻¹, and a high thermal conductivity of κ ≈ 4 W/(cm·K). Atoms likely play a small role in the thermal conductivity of metals because the density of atoms is similar in metals and insulators. Because typical electrical insulators are also poor thermal conductors12, the thermal conductivity of metals must be due to the conduction electrons. Based on this reasoning, a model for the thermal conductivity of metals due to electrons alone goes like this: consider a piece of metal as shown in Fig. 2.7. The volume density of electrons n = N/V is uniform across the metal, but the left end is held at a hotter temperature T1 than the right end, which is at T2. There is a temperature gradient across the metal from the left to the right. Drawing an analogy to the charge current density Jcharge = σ·(−∇V), where the potential gradient −∇V = E is the electric field and σ the electrical conductivity, we hunt for an expression for the heat current in the form Jheat = κ·(−∇T), where ∇T is the temperature gradient. If we can write the heat current in this form, we can directly read off the thermal conductivity κ. Because the heat current density Jheat has units of J/(cm²·s) = W/cm², the thermal conductivity κ must have units of W/(cm·K). Looking at Figure 2.7, we zoom into the plane x, and ask how much heat energy current is whizzing past that plane to the right. Because the density of conduction electrons is uniform in space, n(x) = n, the heat current flows because of a difference in energy via the temperature gradient. Let E be the individual electron energy. The energy depends on x via the temperature of electrons at that plane.
We shall call the average length traversed by the electron between scattering events the mean free path. In the plane one mean free path vxτ to the left of x, the energy of electrons is E[T(x − vxτ)]. Electrons on this (hotter) side have energies higher than those one mean free path to the right of plane x, which have the energy E[T(x + vxτ)]. Half the carriers at x − vxτ are moving to the right, carrying a heat current density (n/2)vx E[T(x − vxτ)]. Similarly, half the carriers at x + vxτ transport heat current (n/2)vx E[T(x + vxτ)] to the left. The net heat current at x is the current flowing to the right, minus the current flowing to the left:

Jheat = (n/2)vx [E[T(x − vxτ)] − E[T(x + vxτ)]].    (2.5)

Now if we assume that the mean free paths vxτ are small, we can write the heat current as

Jheat = (n/2)vx (∆E/∆T)(∆T/∆x)∆x = (n/2)vx (dE/dT)(dT/dx)(−2vxτ).    (2.6)

We use the fact that because of isotropic motion of electrons in three


dimensions, vx² = vy² = vz² = v²/3. To obtain the term dE/dT, we connect it to the definition of the electron specific heat, cv = (1/V)(dU/dT), where U = N E is the total energy of the electron system, and n = N/V. The physical meaning of the specific heat is the amount of energy that must be provided to the electron system to cause a unit degree rise in its temperature. For free classical motion of the electron gas in three dimensions, classical thermodynamics asserts an equipartition of energy, prescribing (1/2)kbT of energy for motion in each degree of freedom, or dimension of motion. Here kb = 1.38 × 10⁻²³ J/K is the Boltzmann constant, the fundamental quantity underpinning classical thermodynamics. So (1/2)me vx² = (1/2)kbT, and the total kinetic energy of each electron is then E = (1/2)me v² = (1/2)me (vx² + vy² + vz²) = (3/2)kbT. The specific heat of the electron gas is then seen to be cv = (1/V)·d(N × (3/2)kbT)/dT = (3/2)nkb. We can write the heat current density as:

Jheat = (1/3)cv v²τ (−∇T) = κ(−∇T).    (2.7)





In the form of Equation 2.7, the Drude model of heat current carried by electrons in a metal gives us a thermal conductivity κ = (1/3)cv v²τ by analogy to the charge current. But the Drude model does more: it also gives a quantitative explanation for the ratio of the electrical and thermal conductivities.
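Equation 2.7 can be put to work numerically. The sketch below combines κ = (1/3)cv v²τ with the classical cv = (3/2)nkb and the equipartition value v² = 3kbT/me; the copper-like n and τ are assumed illustrative inputs, not chapter data:

```python
# Classical Drude estimate of the thermal conductivity (Eq. 2.7):
# kappa = (1/3) cv v^2 tau, cv = (3/2) n kb, v^2 = 3 kb T / me.
kb = 1.381e-23; me = 9.109e-31
n   = 8.5e28       # conduction electron density, /m^3 (assumed copper-like)
tau = 2.5e-14      # scattering time, s (assumed)
T   = 300.0        # K

cv    = 1.5 * n * kb          # classical electronic specific heat, J/(m^3 K)
v2    = 3.0 * kb * T / me     # equipartition mean-square velocity, m^2/s^2
kappa = cv * v2 * tau / 3.0   # thermal conductivity, W/(m K)

print(f"kappa ~ {kappa/100:.1f} W/(cm K)")   # same order as copper's measured ~4
```

That the answer lands within a factor of a few of measured values is part of what made the Drude model so convincing – and, as Section 2.7 shows, partly an accident.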




2.6 Icing on the cake: The Wiedemann–Franz law



One of the major successes of the Drude electron model of the electrical and thermal conductivity of metals was to provide an explanation for the Wiedemann–Franz law, which had languished for half a century without an explanation. From Equations 2.3 and 2.7, we get the ratio

κ/(σT) = [(1/3)cv v²τ] / [(nq²τ/me)T] = [(1/3)·(3/2)nkb·(3kbT/me)·τ] / [(nq²τ/me)T] = (3/2)(kb/q)²

⟹  κ/(σT) = (3/2)(kb/q)² = L.    (2.8)

Rather amazingly, every microscopic detail specific to a particular metal, such as the electron concentration n and the scattering time τ, cancels out in the ratio above! What remains are the two fundamental constants that underpin classical physics: the Boltzmann constant kb, and the electron charge q. This seemed to explain the Wiedemann–Franz law beautifully (see Fig. 2.8). The ratio is called the classical Lorenz number L, with a value ∼10⁻⁸ (V/K)². Thus, Drude's model of a classical electron gas seems to have resolved the mystery we set out with in Section 2.1. Or has it?
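The cancellation in the ratio is easy to verify numerically. The following sketch evaluates the classical Lorenz number of Equation 2.8 and, for comparison, the quantum value quoted in the caption of Fig. 2.8:

```python
# Numerical check of Eq. 2.8: classical vs. quantum Lorenz numbers,
# using standard values of the fundamental constants.
import math

kb = 1.381e-23   # J/K, Boltzmann constant
q  = 1.602e-19   # C, electron charge magnitude

L_classical = 1.5 * (kb / q)**2               # Drude result, Eq. 2.8
L_quantum   = (math.pi**2 / 3) * (kb / q)**2  # Sommerfeld result (Fig. 2.8)

print(f"L (classical) ~ {L_classical:.2e} (V/K)^2")        # ~1.1e-8
print(f"L (quantum)   ~ {L_quantum:.2e} (V/K)^2")          # ~2.4e-8
print(f"quantum/classical = {L_quantum/L_classical:.2f}")  # ~2.2: 'roughly twice'
```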


Fig. 2.8 (a) The thermal conductivity of various metals, (b) the corresponding electrical conductivity, and (c) the demonstration that κ/(σT) is the same for metals. The quantum Lorenz number Lq = (π²/3)(kb/q)² is shown as the dashed line, highlighting the Wiedemann–Franz law. The quantum Lorenz number is roughly twice the classical value of Equation 2.8, and is discussed later.


2.7 All is not well

13 Huxley: "The great tragedy of science: the slaying of a beautiful hypothesis by an ugly fact."

The success of the Drude model in explaining the Wiedemann–Franz law is remarkable, but unfortunately it is fundamentally flawed. Note that it did the best job with all the experimental data that was available till that time. We will see later that there is a crucial cancellation of two unphysical quantities that leaves the ratio intact. When experimentalists measured the specific heat cv of electrons in metals, it was found to be much smaller than the classical result cv = (3/2)nkb obtained by Drude. The experimental value is found to be roughly 1/60 of the classical result. This is a null result13 that hints at something deeper – and we will soon see that that something is quantum mechanics. We will see in subsequent chapters that by demanding that electrons follow the rules of quantum mechanics instead of Newtonian mechanics, the electronic heat capacity was shown by Arnold Sommerfeld to be cv = [(π²/2)nkb]·[kbT/EF] instead of the classical cv = (3/2)nkb. Here EF = pF²/(2me) is the Fermi energy, pF = ħkF the Fermi momentum, and kF = (3π²n)^{1/3} is the Fermi wavevector. These three quantities are quantum-mechanical concepts, and simply cannot be explained by classical physics. Aside from the constants in the correct expression for the electron heat capacity, only a small fraction of the n conduction electrons – ∼kbT/EF to be precise – seem to actually contribute to the heat capacity. We now know the reason: electrons, being fermions, are subject to the Pauli exclusion principle. This principle prevents two electrons from occupying the same quantum state in the metal. The consequence of the exclusion principle is rather drastic on the distribution of electrons. Fig. 2.9 highlights this difference. In classical statistical physics, the number of electrons at an energy E goes as e^{−E/kbT}, which is the Maxwell–Boltzmann distribution. So most electrons will pile up at the lowest allowed energies E → 0 to lower the system energy. When extra energy is pumped into the electron system by whatever means, electrons at all energies in the distribution have equal opportunity to increase their energy. This is what led to the

Fig. 2.9 Figure showing how the much higher heat capacity of electrons comes about because of assuming the classical Maxwell-Boltzmann distribution in energy. The Pauli exclusion principle of quantum mechanics changes the distribution to the Fermi-Dirac distribution, which fixes the electronic specific heat anomaly of the Drude model.

classical heat capacity cv = (1/V)·d(N × (3/2)kbT)/dT = (3/2)nkb. However, when electrons are forced to obey the Pauli exclusion principle, once an allowed state is occupied by an electron, it is impossible for a second electron to occupy the same state. The second electron must occupy a higher energy state. If we continue filling the states till the last electron, the highest energy occupied at T = 0 K is referred to as the chemical potential µ. While the chemical potential does not depend on temperature, the Fermi energy EF(T) does: it captures the increase in the energy of the electron distribution with heat. The two energies are equal at absolute zero, µ = EF(T = 0). The occupation probability of a state of energy E upon enforcing the Pauli exclusion principle is the Fermi–Dirac distribution, f(E) = 1/(1 + e^{(E−µ)/kbT}), whose maximum value is 1. This is shown in Fig. 2.9. We will discuss this


distribution in significant detail later. The Fermi–Dirac distribution of electrons makes it clear why only a fraction kbT/EF of the electrons in a metal can actually absorb energy and promote themselves to higher energy states. Because of the Pauli exclusion principle, none of the electrons in the energy window 0 < E ≲ EF − 3kbT can increase their energy by absorbing kbT of energy, because they are Pauli-blocked. Electrons in only a tiny sliver of energies kbT around the Fermi energy EF have the freedom to jump to higher energy states that are unoccupied. Thus, the electronic heat capacity cv is much smaller than what was predicted by the Drude model. We will also see in Exercise 2.3 that the sign of the current-carrying charge as measured by the Hall effect should always be negative for electrons, but it is measured to be positive for some materials. This flipping of the sign also needs quantum mechanics to explain, as it is a result of the formation of energy bands. Finally, the best thermal conductor is diamond, with κ ∼ 22 W/(cm·K), roughly five times that of copper, and it is an excellent electrical insulator! This flies smack in the face of the Wiedemann–Franz law, because heat in some materials is not carried by electrons, but by lattice vibration quanta called phonons. In the next chapter, we begin our quantum journey and see why electrons must follow the exclusion principle and the Fermi–Dirac distribution in the first place.
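To get a feel for the scales involved, the sketch below evaluates kF, EF, and the ratio of the Sommerfeld to the classical specific heat for an assumed copper-like electron density (an illustrative value, not chapter data), and then tabulates the Fermi–Dirac occupation in a window around µ:

```python
# The quantum scales behind Sommerfeld's correction, for an assumed
# copper-like electron density n at room temperature.
import math

hbar = 1.055e-34; me = 9.109e-31; kb = 1.381e-23; q = 1.602e-19
n, T = 8.5e28, 300.0                     # /m^3 (assumed) and K

kF = (3.0 * math.pi**2 * n)**(1.0/3.0)   # Fermi wavevector, 1/m
EF = (hbar * kF)**2 / (2.0 * me)         # Fermi energy, J
print(f"EF ~ {EF/q:.1f} eV")             # several eV for a good metal

# cv(Sommerfeld)/cv(classical) = [(pi^2/2) n kb (kb T/EF)] / [(3/2) n kb]
ratio = (math.pi**2 / 3.0) * (kb * T / EF)
print(f"cv quantum/classical ~ {ratio:.3f}")   # ~0.01: only a sliver responds

# Fermi-Dirac occupation around mu = EF(T=0): the Pauli-blocked window
f = lambda E: 1.0 / (1.0 + math.exp((E - EF) / (kb * T)))
for dE in (-0.5, -0.1, 0.0, 0.1, 0.5):   # offsets from mu, in eV
    print(f"f(mu {dE:+.1f} eV) = {f(EF + dE*q):.3f}")
```

The occupation drops from essentially 1 to essentially 0 within a few kbT of µ – precisely the Pauli-blocked sliver described above.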

2.8 Chapter summary section

In this chapter, we learned:

• The electronic and thermal conductivities of metals were explained by the Drude model by attributing these properties correctly to the newly discovered particle, the electron.
• In the Drude model, free electrons subject to the laws of classical mechanics, electromagnetism, and thermodynamics could explain the electronic conductivity σ and the thermal conductivity κ reasonably successfully.
• The Drude model also seemed to resolve a long-standing mystery, the empirical Wiedemann–Franz law, which stated that the ratio κ/(σT) is a constant for metals.
• The heat capacity of free conduction electrons in metals predicted by the Drude model turned out to be inconsistent with the measured values, which were far too small. It would take the full machinery of quantum mechanics, and a quarter century, to resolve this discrepancy.


Further reading

For insightful and comprehensive discussion of the theory of metals, the classic text Solid State Physics by Ashcroft & Mermin remains unmatched in covering everything that was known till the early 1980s, and is highly recommended for critical reading. The Theory of the Properties of Metals and Alloys by Mott & Jones, a much older text, shows the earliest application of quantum mechanics to the understanding of metals and alloys. For a historical review of how the understanding of metals developed in its early times, Reviews of Modern Physics, vol. 59, pg. 287 (1987) by Hoddeson, Baym, and Eckert is an enjoyable read. Finally, the classic article that introduced the electron theory of metals to the world in its complete form, Electronentheorie der Metalle by Sommerfeld and Bethe, in Handbuch der Physik (1933), remains one of the most comprehensive to date. It is in German, and still worth a scan even if you don't know the language!

Exercises

(2.1) Metals, insulators, and stars in the universe
(a) Recent estimates put the number of stars in the universe at ∼10³⁰, dark matter notwithstanding. Show that the span of electrical resistivities ρ = 1/σ from the best metals to the best electrical insulators is larger than the number of stars in the universe. Electrical resistivities of the best metals are in the range of ρ ∼ 10⁻⁶ ohm·cm.
(b) It is useful to find combinations of fundamental constants that yield physical quantities. Show that for electrical resistivity, such a combination is ρQ = (h/q²)·aB⁰, where h is Planck's constant, q the electron charge, and aB⁰ the Bohr radius of the hydrogen atom. Show that the value of this "quantum" resistivity is roughly 100 times higher than that of the lowest resistivity metals.
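Part (b) can be checked in a few lines, using standard values of the constants:

```python
# Check of Exercise 2.1(b): rho_Q = (h/q^2) * a_B, with standard constants.
h  = 6.626e-34   # J s, Planck's constant
q  = 1.602e-19   # C, electron charge
aB = 0.529e-10   # m, Bohr radius

R_Q   = h / q**2            # resistance scale h/q^2, ohm
rho_Q = R_Q * aB * 100.0    # ohm-cm (factor 100 converts m to cm)

print(f"h/q^2 ~ {R_Q/1e3:.1f} kilo-ohm")               # ~25.8 kilo-ohm
print(f"rho_Q ~ {rho_Q*1e6:.0f} microohm-cm")          # vs ~1.7 for copper
print(f"ratio to best metals ~ {rho_Q/1.7e-6:.0f}x")   # order 100, as claimed
```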

(2.2) Superconductivity In 1911, Heike Kamerlingh Onnes discovered that some metals when cooled to low temperatures lose their resistivity completely, meaning ρ = 0 exactly. The near-magical properties of superconductors are the source of as much endless fascination for scientists as for the general public. For example, they expel all magnetic fields in them, and float in thin air defying gravity.

Fig. 2.10 Levitating superconductor.

The levitation images (Fig. 2.10) are near iconic and trains have already been built on this concept. In this book, we will not discuss the physics of superconductivity, which has to do with the pairing of electrons to form a highly correlated state, which makes the current flow without dissipation. Unfortunately to date superconductivity has been observed only at low temperatures. But we sure can dream! Imagine tomorrow morning you wake up to the news that a room-temperature superconductor has been discovered: that news and its implications will surely change the face of the earth. What would you do with it? (2.3) Who’s in charge? The Hall effect In 1879, nearly twenty years before the discovery of the electron, Edwin Hall (Fig. 2.11) during his

PhD work at the Johns Hopkins University made a most useful discovery. In Chapter 25, a detailed treatment of the Hall effect and the behavior of electrons in magnetic fields is given. This Exercise is an early introduction to these phenomena.

Fig. 2.11 Edwin Hall discovered what is now known as the Hall effect in 1879 by detecting a lateral voltage in a current-carrying gold leaf due to a magnetic field. This effect is now a heavily used method to measure the carrier concentration and mobility of current-carrying particles in all sorts of materials. Several new Hall effect variants have been uncovered, such as the quantum Hall effect, the spin-Hall effect, etc. These are described in Chapter 25.

Consider a conductor of thickness Lz, length Lx, and width Ly respectively. A current I flows along the length, and a magnetic field Bz is applied along the thickness. The Lorentz force due to the electric and the magnetic fields acts on the charges carrying the current.

(a) Show that if the volume density of charge carriers in the conductor is n, then a voltage develops in the direction perpendicular to the current and to the magnetic field. This voltage is called the Hall voltage

VH = −Ix Bz / (qnLz).    (2.9)

(b) Show why the sign of the Hall voltage is an experimental measurement of the sign of the charge of the current-carrying particles.

(c) Show that the ratio of the lateral electric field Ey to the magnetic field and current density is given by

RH = Ey / (jx Bz) = 1/(qn),    (2.10)

which is independent of the sample dimensions and the magnetic field. The Hall coefficient thus is a direct measurement of the mobile carrier concentration in the material. Also, this simple model suggests that the Hall coefficient does not depend on any details of scattering events, or the magnetic field. Though it is mostly correct, metals can demonstrate varying degrees of magnetoresistance, but do not deviate too far from Equation 2.10, making it a trustworthy measurement of the carrier density and the sign of its charge. We will come back to the discussion of the Hall effect, and why even in materials in which electrons carry the current, we can experimentally measure a positive sign of the carriers!
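To get a feel for Equation 2.9, here is a sketch with assumed illustrative numbers (a copper-like n, a 100 µm thick sample, I = 1 A, Bz = 1 T):

```python
# Magnitude of the Hall voltage, Eq. 2.9, with assumed illustrative inputs.
q  = 1.602e-19
n  = 8.5e28      # carrier density, /m^3 (assumed copper-like)
I  = 1.0         # current, A
Bz = 1.0         # magnetic field, T
Lz = 100e-6      # sample thickness, m

VH = I * Bz / (q * n * Lz)                  # magnitude of the Hall voltage
print(f"|V_H| ~ {VH*1e6:.2f} microvolts")   # sub-microvolt in a metal
```

The microvolt-scale answer shows why Hall measurements on metals are delicate, and why semiconductors, with orders of magnitude smaller n, give far larger Hall voltages.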


(2.4) Breakdown: Critical electric fields in insulators
(a) Consider the parallel plate capacitor: two flat metal plates separated by an electrically insulating barrier of thickness tb and dielectric constant eb. If a voltage V is applied across the metal plates, show that the electron sheet density σm at the metal/insulator interfaces is qσm = eb V/tb. This result, which is a statement of Gauss's law, can also be viewed as follows: the electric field F = V/tb across a planar insulator layer is equal to the sheet charge density divided by the dielectric constant of the insulator, F = qσm/eb. This simple result is of very high importance in field-effect transistors, or FETs.
(b) If the critical electric field at which the insulator breaks down is Fcr, the maximum sheet density that the insulator can sustain is σcr = eb Fcr/q. The best electrical insulators break down at fields Fcr ∼ 10 MV/cm. Estimate the maximum charge such an insulator can control by field effect. This is a fundamental limit that shows why the carrier density in typical metals cannot be modulated by field effect using an insulator.
(c) The book Plastic Fantastic by Reich describes the unearthing of a major scientific fraud, where a researcher claimed to have created a metal out of an organic semiconductor by field effect, and this organic metal became a superconductor at low temperatures. The researcher had not paid attention to the currently known limits of part (b) of this Exercise. Read up about this affair and be aware of it!

(2.5) Mirror mirror on the wall
Believe it or not, coating glass with metal to make a mirror was a big technological breakthrough back in the day (quite possibly marking the birth of fashion?). In this Exercise, we answer why metals are shiny – why they reflect most of the visible light incident on them. Not surprisingly, this has to do with the conduction electrons in the metal. Here is our mode of attack: when the light wave experiences a mismatch in the refractive index, part of it is transmitted, and part reflected. So we will ask Maxwell's equations to give us the reflection coefficient when a light beam is incident from air on a metal. If we find that this reflection coefficient is very high for visible wavelengths, we have succeeded in explaining why metals are shiny.

Fig. 2.12 Reflectance spectra of three metals (adapted from Wikipedia).

Fig. 2.12 shows the measured reflectance of three common metals as a function of the wavelength of light incident on them. Your job in this Exercise is to calculate and make your own plot by standing on Maxwell, Drude, and Newton's shoulders. If things go right, you may even explain the dips and wiggles in Fig. 2.12. I will roughly outline the method and trust you can finish the story:

The reflection coefficient Γr will depend on the refractive index, and hence on the dielectric constant e(ω), of the metal, which in turn will depend on how the conduction electrons respond to the oscillating electric field of the light beam. This is where Drude's free electron model of the metal – the same model that explained electrical and thermal conductivity – will help us through.

The electric field of the light beam according to Maxwell oscillates in time as E(t) = E0 e^{iωt}, where E0 is the amplitude, ω = 2πf is the radial frequency, f = c/λ with c the speed of light, and the wavelength λ of light is the x-axis of the plot. The reflection coefficient for light is

Γr = Er/Ei = (√e0 − √e(ω)) / (√e0 + √e(ω)),

where e is the dielectric constant of the medium. The reflectance is R = |Γr|², which can be found for various wavelengths; this is the y-axis of the plot. Note that all symbols have their usual meanings.

(a) From Maxwell's equation ∇ × H = J + iωe(ω)E in material media, show that the dielectric constant of the metal is

e(ω) = e0 [1 − i σ(ω)/(ωe0)].

(b) Now if you have the frequency-dependent conductivity σ(ω), you can make your plot by looking up the properties of the metal! The DC Drude model for conductivity was obtained in this chapter as σ(0) = nq²τ/me. Here you need to use Newton's law again for the AC electric field in the form

−qE0 e^{iωt} = me (dv/dt) − me v/τ,

and solve to show the following:

σ(ω) = σ0/(1 − iωτ) = σ0/(1 + (ωτ)²) + i ωτσ0/(1 + (ωτ)²),   (2.11)

where the first term is Re(σ(ω)) and the second is Im(σ(ω)).

(c) Now you are close to the finish line. Use the above modified Drude AC conductivity, look up the required properties of the three metals, and plot the reflectances of all three metals. Compare with Fig. 2.12 and comment on the similarities and differences.
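As a sanity check on the outline above, here is a minimal Python sketch of the reflectance calculation. The aluminum-like values of n and τ are rough illustrative assumptions (not from the text). σ(ω) follows Eq. (2.11); the dielectric function is then formed in the e^{−iωt} sign convention, e(ω) = e0[1 + iσ(ω)/(ωe0)], the complex conjugate of the e^{+iωt} form of part (a), which leaves the reflectance R = |Γr|² unchanged while keeping numpy's principal square root on the physical branch.

```python
# Sketch of the Drude reflectance of a metal; parameters are assumptions.
import numpy as np

e0 = 8.854e-12         # vacuum permittivity (F/m)
q = 1.602e-19          # electron charge (C)
me = 9.109e-31         # electron mass (kg)
c = 3.0e8              # speed of light (m/s)

n = 1.8e29             # free-electron density (1/m^3), roughly aluminum
tau = 8e-15            # Drude scattering time (s), rough value
sigma0 = n * q**2 * tau / me             # DC Drude conductivity

lam = np.linspace(200e-9, 1000e-9, 200)  # wavelengths (m): the x-axis
w = 2 * np.pi * c / lam                  # radial frequency
sigma = sigma0 / (1 - 1j * w * tau)      # AC Drude conductivity, Eq. (2.11)
eps_r = 1 + 1j * sigma / (w * e0)        # relative dielectric constant

gamma = (1 - np.sqrt(eps_r)) / (1 + np.sqrt(eps_r))  # reflection coefficient
R = np.abs(gamma)**2                     # reflectance: the y-axis
print(R.min(), R.max())                  # stays near 1: the metal is shiny
```

Because the plasma frequency of this electron density lies deep in the ultraviolet, R stays close to unity across the entire visible range, which is the point of the Exercise; the dips and wiggles of real metals come from interband transitions beyond the Drude model.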

Quantum Mechanics in a Nutshell

This chapter presents a very short summary of the major ideas of quantum mechanics. By tracing the historical development of its core ideas, we learn how to treat the static and the dynamic laws of electrons and their statistical properties by the laws of quantum mechanics rather than by classical thermodynamics and Newton's laws. In writing this chapter I thought I was kidding myself in literally trying to bottle a genie; the consequence is that this chapter is longer than the others. Quantum mechanics is one of the greatest intellectual advances in the history of science, and a single chapter can hardly do justice to the subject! I highly recommend that you supplement your reading of this chapter with your favorite quantum mechanics texts. I have suggested a few in the Further Reading section at the end of the chapter. The major goals of this chapter are to understand the following concepts:

• What experimental facts led to the development of quantum mechanics? • What are the core ideas of quantum mechanics, and how are they different from classical mechanics? • How did quantum mechanics lead to the explanation of the existence of atoms, molecules, and the entire periodic table? • How did quantum mechanics forever change the core concepts of classical thermodynamics through the Fermi–Dirac and Bose–Einstein statistics?

3.1 Planck's photon energy quanta
3.2 Bohr's electron energy quanta
3.3 Wave-particle duality
3.4 The wavefunction
3.5 Operators
3.6 States of definite momentum and location
3.7 States of definite energy: The Schrödinger equation
3.8 Time-dependent Schrödinger equation
3.9 Stationary states and time evolution
3.10 Quantum current
3.11 Fermions and bosons
3.12 Fermion and boson statistics
3.13 The spin-statistics theorem
3.14 The Dirac equation and the birth of particles
3.15 Chapter summary section
Further reading

3.1 Planck's photon energy quanta

Time: the end of the 19th century. Maxwell's equations have established Faraday's hunch that light is an electromagnetic wave. However, experimental evidence mounted that a beam of light is composed of particles that pack a definite momentum and energy. Here is the crux of the problem: consider the double-slit experiment shown in Fig. 3.3. Monochromatic light of wavelength λ passing through two slits separated by a distance d ∼ λ forms a diffraction pattern on a photographic plate. If one tunes down the intensity of light in a double-slit experiment to a very low value, one does not get a "dimmer" interference pattern, but discrete strikes on the photographic plate and illumination at specific points. This means light is composed of "particles" whose energy and momentum are concentrated at one point, which leads to discrete hits. So here is surprise 1: though light is a wave, its constituent packet is swallowed whole by just one detector! Surprise 2 is the following: if the above experiment is repeated billions of times, and the counts of each individual detector are added up to create a histogram, the histogram becomes the classical diffraction pattern! This curious phenomenon means that though the energy and momentum of a packet of light are concentrated at a point and behave as a particle, its wavelength extends over all space, which leads to diffraction patterns in the large-number limit.

This experiment, routinely done today, was not possible at the end of the 19th century because experimental sources that could produce single light quanta (or photons) did not exist at that time. However, their existence was forced upon us as the only possible explanation of the blackbody radiation spectrum: a well-known experimental fact for which classical physics could offer no explanation. If we measure the intensity of light radiated from a blackbody at a temperature T (for example, a cast-iron oven with a small opening in it), the radiation spectrum has the characteristic bell shape shown as the experimental curve in Fig. 3.2. The spectral density u(ω) (or energy content) decreases at high frequencies ω (if it did not, we would not exist!). Rayleigh and Jeans had attempted to explain the spectrum. In Exercise 3.2, we will find that the number of modes (or the density of states) of light waves that fit in a box increases with frequency as ω². To each of these allowed modes of light, classical thermodynamics allocates ∼ kB T of energy. This principle, derived in Chapter 4, is called the equipartition of energy. These considerations led Rayleigh and Jeans to the result u(ω) ∼ ω² (kB T) for the energy content with frequency at a temperature T. Now this spectrum diverges as ω² and blows up at high frequencies – the ultraviolet catastrophe. Clearly inconsistent with experiments, it remained one of the major mysteries of the time.

Max Planck (Fig. 3.4), after considerable internal struggle, postulated that in any allowed frequency mode ω, light energy can only exist in quanta of E = n(ħω), where n = 0, 1, 2, ... is an integer, and cannot assume a continuous range of energies. Here ħ = h/2π, where h = 6.63×10⁻³⁴ J·s is Planck's constant, and ħ is the reduced Planck's constant. In Fig. 3.2, Planck's quantum hypothesis is indicated schematically as the lines E = nħω crossing the origin. It is clear from this hypothesis that if energy E ≤ kB T is available from heat, as indicated by the horizontal line, the number of light quanta in any mode ω decreases with increasing ħω. For example, of the two modes shown, ω1 has n = 0, 1, 2, 3, 4, 5 allowed, and ω2 has n = 0, 1, 2, 3. Higher-energy modes have fewer quanta occupied at a given energy because of the discrete number allowed for each mode. This postulate led Planck to derive a spectral density of radiation given by

u(ω) ∼ ω³ / (e^{ħω/kB T} − 1).

His formula exactly explained the experimental data, resolving the ultraviolet catastrophe. In Exercise 3.2, I have pointed out steps for you to work this out for yourself, reliving Planck's struggle and the birth of quantum mechanics. Planck's key postulates were that photons in the mode of circular frequency ω can only have discrete energies E = nħω, and each photon in that mode has a momentum p = ħk. Here k = (2π/λ)n̂, where n̂ is a unit vector in the direction of propagation, and ω = c|k|, where c is the speed of light. Planck's blackbody radiation formula has been experimentally confirmed with spectacular accuracy to this day. It enables measurement of temperatures from microscopic to cosmic scales (see Exercise 3.11).

Just about five years after Planck's result, Einstein (Fig. 3.5) used it to explain the phenomenon of ejection of electrons from metals by light, known as the photoelectric effect, which is discussed further in Exercise 3.3. To explain the photoelectric effect, Einstein used two core concepts: (a) the energy in any mode of light is quantized to n(ħω) as postulated by Planck, and (b) the recently discovered particle, the electron. Einstein was developing the theory of relativity around the same time. In this theory, the momentum of a particle of mass m and velocity v is p = mv/√(1 − (v/c)²), where c is the speed of light. Thus if a particle has m = 0, the only way it can pack a momentum is if its velocity is v = c. Nature takes advantage of this possibility and gives us such particles. They are now called photons. Thus photons have no mass, but have momentum. Thus light, which was thought to be a wave, acquired a certain degree of particle attributes. So what about particles with mass – do they have wave nature too? Nature is too beautiful to ignore this symmetry!

Fig. 3.1 Michael Faraday, one of the greatest experimentalists of all time. He discovered the relation between electric and magnetic fields, and influenced Maxwell to discover that light is an electromagnetic wave. Light played the central role in the development of quantum mechanics.

Fig. 3.2 The failure of classical physics to explain the experimental blackbody radiation spectrum in the high-energy regions (referred to as the ultraviolet catastrophe), and Planck's spectacular resolution of the catastrophe using the quantum hypothesis that the amount of energy of light in any mode ω is quantized and can exist only in packets of E = nħω, where n is an integer.

Fig. 3.3 Photons behaving as particles.

Fig. 3.4 Max Planck is considered to be the "father" of quantum mechanics. He postulated quanta of light (photons) to explain the blackbody radiation spectrum. He was forced to put forward this hypothesis because he discovered that it explained the experimental data, in spite of the conflict it caused him as a classically trained physicist to whom such quanta were inconceivable. Planck was awarded the Nobel Prize in Physics in 1918.
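The contrast between the Rayleigh–Jeans and Planck spectral densities can be sketched numerically. In this example the common prefactors are dropped and only the spectral shapes u ∼ ω²kBT and u ∼ ħω³/(e^{ħω/kBT} − 1) are compared; the temperature is an arbitrary illustrative choice.

```python
# Compare the Rayleigh-Jeans and Planck spectral shapes (prefactors dropped).
import numpy as np

hbar = 1.0546e-34      # reduced Planck constant (J s)
kB = 1.3807e-23        # Boltzmann constant (J/K)
T = 5000.0             # temperature (K): an arbitrary hot-body choice

w = np.linspace(1e13, 1e16, 2000)      # radial frequencies (1/s)
u_rj = w**2 * kB * T                    # Rayleigh-Jeans: diverges as w^2
u_planck = hbar * w**3 / (np.exp(hbar * w / (kB * T)) - 1)  # Planck

ratio_low = u_planck[0] / u_rj[0]       # ~1: agreement when hbar*w << kB*T
ratio_high = u_planck[-1] / u_rj[-1]    # ~0: Planck's exponential cutoff
print(ratio_low, ratio_high)
```

The two shapes coincide at low frequency, but Planck's quantization cuts the spectrum off exponentially at high frequency, so the Planck curve has an interior maximum instead of the Rayleigh–Jeans divergence.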

Fig. 3.5 Albert Einstein is considered the greatest physicist since Newton. In addition to special and general relativity, he contributed significantly to the development of quantum mechanics, and all areas of physics. He was awarded the Nobel Prize in 1921 for the photoelectric effect, an early experiment confirming quantum theory. He did not have a brother, and certainly did not start a bagel company.


The interaction of light with matter (i.e., particles with mass) in its purest form – that is, with pure gases of the elements such as hydrogen – resulted in spectra that were nothing like the blackbody spectrum seen in Fig. 3.2. Instead, the spectrum emitted by pure hydrogen was at a set of very peculiar and sharp frequencies. These frequencies were identified in the 1880s by Rydberg to follow the empirical relationship

ħω = (13.6 eV)(1/n1² − 1/n2²) ≡ Ry (1/n1² − 1/n2²),   (3.1)

where n1 and n2 are integers, as indicated in Fig. 3.6. The spectral lines were identified in sets with n1 = 1 and n2 = 2, 3, ... in one set, n1 = 2 and n2 = 3, 4, 5, ... in another, and so on. Though this strange empirical pattern was measured for several pure atomic species and the energy unit Ry = 13.6 eV came to be known as the Rydberg, the reason behind these sharp spectral lines remained a complete mystery. The solution to this mystery came from an unexpected corner; it had to wait nearly three decades for the discovery of the heart of the atom itself.

Fig. 3.6 The spectrum of pure hydrogen is a set of sharp lines, in contrast to the blackbody radiation. The lines were precisely at energies given by the empirical Rydberg formula, for which there was no explanation.
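Equation 3.1 can be evaluated directly. The sketch below computes the n1 = 2 lines (the visible Balmer series) and converts photon energy to wavelength with the standard conversion hc ≈ 1239.84 eV·nm, recovering the familiar hydrogen line positions.

```python
# Evaluate the empirical Rydberg formula, Eq. (3.1).
Ry = 13.6          # Rydberg energy (eV)
hc = 1239.84       # photon energy-to-wavelength conversion (eV nm)

def line_energy(n1, n2):
    """Photon energy (eV) of the n2 -> n1 transition, Eq. (3.1)."""
    return Ry * (1.0 / n1**2 - 1.0 / n2**2)

# Balmer series: n1 = 2, n2 = 3, 4, 5 ...
balmer_nm = [hc / line_energy(2, n2) for n2 in (3, 4, 5)]
print(balmer_nm)   # ~656, ~486, ~434 nm: the red, cyan, and violet H lines
```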

3.2 Bohr’s electron energy quanta

Fig. 3.7 Ernest Rutherford, the discoverer of the nucleus and the proton, is considered the father of nuclear physics. He was awarded the Nobel Prize in Chemistry in 1908.

Fig. 3.8 Niels Bohr was the original architect and ”conscience keeper” of quantum mechanics, and an intellectual leader who influenced an entire generation. He was awarded the Nobel Prize in 1922. His own theory certainly passed his own famous test: ”We agree your theory is crazy. But it is not crazy enough to be true.”

In 1911, Ernest Rutherford (Fig. 3.7) discovered the nucleus of the atom: a very dense and small region in an almost empty atom that consisted of exactly the same net charge, but of sign opposite to that of the electrons. This discovery opened a veritable Pandora's box of questions regarding the stability of the atom. The simplest atom – hydrogen – has one electron and one proton in the nucleus. If these particles followed the laws of classical mechanics, the electron should instantly crash into the proton due to the immense Coulomb attraction force between them. In classical mechanics and classical electromagnetism, Earnshaw's theorem asserts that a collection of point charges cannot be in stable equilibrium due to electrostatic forces alone. So according to classical mechanics and electromagnetism, the hydrogen atom, or any other atom, is unstable, and should not exist! But we know from experiments that the stability of atoms is their hallmark: they are eternally stable objects. Immense energies of the scale of E = mc² are needed to break them apart. This conundrum was resolved by Niels Bohr (Fig. 3.8) with an explanation that shook classical mechanics to its core. It is important to go through his arguments to see how he resolved the conundrum and, for the first time, was able to explain the origin of the sharp Rydberg spectral lines. His arguments go as follows: the light beam incident on the atoms primarily interacts with the electrons as they are much lighter than the nucleus. The total energy E of the electron of mass me and nonrelativistic momentum p = me v at a distance R from the proton in a hydrogen atom is the sum of the kinetic energy and the potential energy due to attraction to the nucleus:

3.2 Bohr’s electron energy quanta 27


E = p²/(2me) − q²/(4πe0 R).   (3.2)


Now the classical values of the electron energy are continuous because R and p can assume any values. So an electron making a transition from a high-energy state to a low-energy state should emit a continuous spectrum of photons, not the sharp discrete lines of Fig. 3.6. Because the experimental spectrum disagrees with classical energies, the classical concept of all possible energies of Equation 3.2 must be incorrect. After several trials and errors, and specifically going back to the existing evidence of the spectral lines, Bohr found a radical solution. His singular hypothesis that solved the entire problem is this: the momentum p of the electron, and the path (or orbit) it takes around the proton in the hydrogen atom, cannot assume the continuous values that are allowed by classical mechanics, but can only take values that satisfy the condition

∮ p dx = nh,


where h is Planck's constant, n = 1, 2, 3, ... is an integer, and ∮ ... dx is a closed-loop integral over the circumference. If this is true, and we assume a circular orbit of radius R (Fig. 3.9), we must have

∮ p dx = pn · 2πRn = nh  ⟹  pn Rn = nħ.

Balancing the classical attractive Coulomb force with the classical centrifugal force, we obtain that the electron can only assume discrete momentum values,

me vn²/Rn = q²/(4πe0 Rn²)  ⟹  pn = (q² me / (4πe0 ħ)) · (1/n) = (ħ/aB) · (1/n),

Fig. 3.9 A classical picture of the Hydrogen atom.


and only orbits of fixed radii given by

Rn = n² (4πe0 ħ² / (q² me)) = n² aB,

where aB = 4πe0 ħ²/(q² me) is called the Bohr radius, and has a value of roughly 0.5 Å. Note that the picture is a set of allowed orbits of radii Rn in which the electron can move around the nucleus, just as the planets around the Sun in the Solar System¹. Now going back to Equation 3.2 with these quantized momenta pn and radii Rn, Bohr obtained the discrete energies

En = pn²/(2me) − q²/(4πe0 Rn) = (1/2) · (ħ²/(me aB²)) · (1/n²) − (ħ²/(me aB²)) · (1/n²) = − ħ²/(2me aB²) · (1/n²)

¹ The theory is also called the "Bohr-orbit" theory as a throwback to Newton's explanation of planetary orbits.

28 Quantum Mechanics in a Nutshell

allowed for the electron. Note that the kinetic energy is half the magnitude of the potential energy, but the potential energy is negative due to attraction. Writing the allowed discrete energies as

En = − (me q⁴ / (2(4πe0)² ħ²)) · (1/n²),

Bohr found the difference between two energy states as

En2 − En1 = (me q⁴ / (2(4πe0)² ħ²)) · (1/n1² − 1/n2²),

where the prefactor is the Rydberg energy Ry. This is exactly equal to the experimentally observed spectral lines of hydrogen! The Rydberg energy was thus discovered to be this unique combination of the electron mass me, its charge q, and Planck's constant ħ. Bohr asserted that the reason for the stability of the atom is that the classical theory of mechanics that was used to argue that the atom should be unstable is simply incorrect for the electron. The mechanics of the electron follows quantum rules, and the question of instability simply does not arise because the discrete orbits are stable. Exactly why they are stable (i.e., stationary states) was proven mathematically much later, and the exact quantization condition also ran into experimental difficulties in explaining the spectrum of helium and other gases. But Bohr had, in one stroke, brought quantum theory to particles, or electrons.

Fig. 3.10 de Broglie proposed in his PhD thesis that particles with mass have wavelengths associated with their motion. He was awarded the Nobel Prize in Physics in 1929.
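The combinations of constants above are easy to verify numerically. This sketch computes the Bohr radius aB = 4πe0ħ²/(q²me) and the Rydberg energy Ry = me q⁴/(2(4πe0)²ħ²) from the SI values of the constants; they should come out to about 0.53 Å and 13.6 eV.

```python
# Bohr radius and Rydberg energy from the fundamental constants.
import math

e0 = 8.854e-12        # vacuum permittivity (F/m)
q = 1.602e-19         # electron charge (C)
me = 9.109e-31        # electron mass (kg)
hbar = 1.0546e-34     # reduced Planck constant (J s)

aB = 4 * math.pi * e0 * hbar**2 / (q**2 * me)            # meters
Ry = me * q**4 / (2 * (4 * math.pi * e0)**2 * hbar**2)   # joules

print(aB * 1e10, Ry / q)   # ~0.529 Angstrom, ~13.6 eV
```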

3.3 Wave-particle duality

One would think that Bohr and Sommerfeld must have concluded from their own quantization condition ∮ p dx = nh that particles have wave nature, because if for the electron p = h/λ, then one can immediately conclude that nλ = ∮ dx = 2πR, meaning an allowed electron orbit is such that an integer number of electron wavelengths fit in any closed path length, not necessarily only the circular one. But such was the power of classical mechanics that it was difficult for even Bohr to accept that the relation p = h/λ, which was already found by Planck for photons or waves, could also apply to electrons. This deep connection – that there is a wavelength associated with particles with mass – took another decade to be figured out. In 1924, de Broglie (Fig. 3.10) hypothesized in his PhD dissertation that classical "particles" with mass also have wavelengths associated with their motion. The wavelength is λ = 2πħ/|p|, which is identical to p = ħk.

How could it be proven? The wavelength of visible light (∼400–700 nm) or photons is such that diffraction gratings (or slits) were available at that time. But electron wavelengths are much shorter². Where might one find a diffraction grating that small? Max von Laue and co-workers in Munich had experimentally diffracted X-rays off crystals, and discovered that the atoms in crystals are arranged periodically in space. In other words, a crystal is a natural spectrometer for X-rays, which are photons of wavelength of the order of interatomic distances of a few Å, where 1 Å = 10⁻¹⁰ m. Crystals played a major role in the discovery of the wave-particle duality (and will play a major role in this book!). Elsasser proposed using a crystal as a diffraction grating not for photons, but for electrons. A few years after de Broglie's hypothesis (and unaware of it!), Davisson and Germer at the Bell Laboratories were shooting electrons in a vacuum chamber at the surface of crystalline nickel (Fig. 3.11)³. They observed diffraction patterns of electrons, which were mysterious to them, but as soon as they learnt of de Broglie's hypothesis, they realized that they had unknowingly proven his hypothesis to be quantitatively correct. All particles had now acquired a wavelength! Fig. 3.12 completes the duality with the corresponding experiment with light (photons) that was discussed in Fig. 3.3.

The Davisson–Germer experiment challenged the understanding of the motion or "mechanics" of particles, which was based on Newton's classical mechanics. In classical mechanics, the central question in every problem is the following: a particle of mass m has location x and momentum p now. If a force F acts on it, what are (x′, p′) later? Newton's law F = m d²x/dt² gives us the answer. The answer is deterministic; the particle's future fate is completely determined from its present, and the force that acts on it. This is no longer correct if the particle has wave-like nature. The wave-particle duality is the central fabric of quantum mechanics. It leads to the idea of a wavefunction (Fig. 3.13).

² The electron wavelength is much shorter than that of a photon of the same energy. This is because an electron packs substantial momentum due to its mass, compared to massless photons.

Fig. 3.11 The Davisson–Germer experiment that for the first time showed that electrons, which are particles, diffract from the surface of a nickel crystal. The periodically arranged atoms of the crystal diffract the electrons, proving conclusively their wave nature, and de Broglie's hypothesis that the wavelength λ and momentum p for both waves and particles are related by λ = h/p.

Fig. 3.12 Electrons behaving as waves. Electrons incident on a semiconductor crystal surface undergo diffraction from the periodic grating of atoms, producing characteristic patterns on a screen. "Reflection high-energy electron diffraction" (RHEED) is a technique used routinely today for peering at the surface of a semiconductor crystal as it grows during molecular beam epitaxy (MBE). The technique is a direct relic of the Davisson–Germer experiment that proves that electrons are waves.

³ The story goes that there was an accident and a leak developed in the vacuum apparatus sketched above. Because of the leak the surface of the nickel oxidized and lost its shine. They heated and melted the metal, and upon cooling it re-crystallized, and to their surprise, the electron diffraction spots showed up!

3.4 The wavefunction

If a particle has a wavelength λ, what is its location x? A wave is an extended quantity. If a measurement of the particle's location is performed, it may materialize at location x0. But repeated measurements of the same state will yield an average value ⟨x⟩ = x0 + ∆x. Separate measurements of the momentum of the particle prepared in the same state will yield the average value ⟨p⟩ = p0 + ∆p. The "uncertainty" relation states that ∆x∆p ≥ ħ/2. This is a strictly mathematical consequence of representing a particle by a wave. Because the numbers (x, p) of a particle cannot be determined with infinite accuracy simultaneously, one has to let go of the classical picture.

How must one then capture the mechanics of a particle? Any mathematical structure used to represent the particle's state must contain information about its location x and its momentum p, since they are forever intertwined by the wave-particle duality. One is then forced to use a function, not a number. This function of a quantum object is denoted by ψ, and is called the wavefunction. A first attempt at constructing a wavefunction for a quantum particle with a momentum p is ψ(x) = A cos(px/ħ). This guess is borrowed from the classical representation of waves in electromagnetism and in fluid dynamics. This wavefunction can potentially represent a quantum particle of a definite momentum p if it satisfies the rule of wave-particle duality. Max Born (Fig. 3.15) provided the statistical interpretation of the wavefunction by demanding that |ψ|² be the probability density, with ∫|ψ|² dx = 1. In this interpretation, |ψ(x)|² dx is the probability that a measurement of the particle's location will find the particle in (x, x + dx). |ψ|² is analogous to the intensity of an electromagnetic wave, and ψ to the electric field amplitude.

Fig. 3.13 Birth of the wavefunction to account for the wave-particle duality.

Fig. 3.14 Joseph Fourier, the French mathematician and physicist, introduced the powerful concept of the Fourier series, which expresses any function as a sum of oscillating functions. Fourier's ideas are central to the development of a quantum mechanics that is consistent with the wave-particle duality. Fourier was a good friend of Napoleon.

Fig. 3.15 Max Born introduced the probabilistic interpretation of the quantum wavefunction. With his student Heisenberg, he discovered matrix mechanics. He influenced and guided several young scientists who made contributions to quantum theory. Born was awarded the Nobel Prize in 1954.


It is clear that |ψ(x)|² = |A|² cos²(px/ħ) assigns specific probabilities to the location of the particle, going to zero at certain points on the x-axis. Since the momentum p is definite, ∆p = 0, and the uncertainty relation ∆x∆p ≥ ħ/2 requires that ∆x → ∞, meaning the location of the particle must be equally probable at all points in space. Thus we reject the attempted wavefunction ψ(x) = A cos(px/ħ), because it is inconsistent with the uncertainty principle: it violates the wave-particle duality. The simplest wavefunction consistent with the wave-particle duality picture is ψp(x) = A e^{ipx/ħ}, where A is a complex constant. The complex exponential respects the wave nature of the particle by providing a periodic variation in x, yet its probability density never goes to zero: the probability density is |ψp(x)|² = |A|², equal at all x. Thus, complex numbers are inevitable in the construction of the wavefunction representing a particle.

In Fig. 3.17 it is shown that the best we can do to represent a particle at a particular location on the x-axis while respecting its wave nature is to create a wavepacket: a linear combination of a bunch of momentum values, ψ(x) = Σp Ap e^{ipx/ħ}, where Ap is the amplitude of each momentum component. If we try to localize the quantum particle tightly in real space x, ∆x is small, but this wavepacket requires a larger number of p weights in the momentum space, and ∆p becomes large. All linear superpositions of the form shown here are allowed in quantum mechanics. All the physical information of the quantum state is contained in the wavefunction ψ(x) in real space, or equivalently, in its representation in the momentum space, which is the set of coefficients (..., Ap1, Ap2, ...). To extract any physical information from the wavefunction, we must apply the corresponding physical operators on the wavefunction.
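The wavepacket construction can be tested numerically: superpose plane waves e^{ipx/ħ} with Gaussian amplitudes Ap and measure the spreads. In this sketch ħ = 1 and all grid parameters are arbitrary choices; a Gaussian packet should saturate the uncertainty bound, ∆x∆p = ħ/2.

```python
# Build psi(x) = sum_p A_p e^{ipx/hbar} and measure dx, dp (hbar = 1).
import numpy as np

hbar = 1.0
x = np.linspace(-40.0, 40.0, 4001)
hx = x[1] - x[0]

p_vals = np.linspace(-4.0, 4.0, 401)        # momentum components in the sum
hp = p_vals[1] - p_vals[0]
A_p = np.exp(-p_vals**2 / (2 * 0.5**2))     # Gaussian amplitudes A_p

# The superposition psi(x) = sum_p A_p e^{i p x / hbar}
psi = (A_p[:, None] * np.exp(1j * np.outer(p_vals, x) / hbar)).sum(axis=0)
prob = np.abs(psi)**2
prob /= prob.sum() * hx                      # normalize |psi(x)|^2

mean_x = (x * prob).sum() * hx
dx = np.sqrt(((x - mean_x)**2 * prob).sum() * hx)

wp = np.abs(A_p)**2
wp /= wp.sum() * hp                          # momentum probability density
mean_p = (p_vals * wp).sum() * hp
dp = np.sqrt(((p_vals - mean_p)**2 * wp).sum() * hp)

print(dx, dp, dx * dp)   # the Gaussian packet saturates dx*dp = hbar/2
```

Narrowing the Gaussian in p (smaller ∆p) visibly widens |ψ(x)|² (larger ∆x), and vice versa, which is the content of Fig. 3.17.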


Fig. 3.16 Werner Heisenberg discovered matrix mechanics with Max Born, and is an original founder of quantum theory. He was awarded the Nobel Prize in 1932. He is better known in pop media for his uncertainty principle.

3.5 Operators

Every physical observable in quantum mechanics is represented by an operator. When the operator "acts" on the wavefunction of the particle, it extracts the value of the observable. For example, the momentum operator is p̂ = −iħ∂/∂x, and for states of definite momentum, p̂ψp(x) = (ħk)ψp(x). We note that (x p̂ − p̂ x) f(x) = iħ f(x) for any function f(x). The presence of the function in this equation is superfluous, and thus one gets the identity

x p̂ − p̂ x = [x, p̂] = iħ.


The square brackets define a commutation relation. The space and momentum operators do not commute4 . This sort of non-commuting mathematical structure of x and p was first uncovered by Heisenberg5 . He was uncomfortable with the Bohr orbit picture, as it offered no explanation of why the coordinates ( x, p) should follow an arbitrary quantization condition. Since experimentally electron orbits were not observed, but only the sharp spectral lines, Heisenberg was trying to

⁴ Objects a and b commute if ab = ba, and do not commute if ab ≠ ba. Clearly if a and b are real or complex numbers, they commute. But if they are matrices, or if one or both are differential operators such as a = d/dx and b = x², they do not necessarily commute, since (d/dx)(x² f(x)) ≠ x² (d/dx) f(x) for an arbitrary function f(x).

⁵ "It was about three o'clock at night when the final result of the calculation lay before me. At first I was deeply shaken. I was so excited that I could not think of sleep. So I left the house and awaited the sunrise on the top of a rock." – Heisenberg.


Fig. 3.17 The superposition principle allows us to create wavefunctions that can represent "wave-like" or "particle-like" states with equal ease. Wave-like states have large ∆x and small ∆p, and particle-like states have small ∆x and large ∆p. All the while, they satisfy the uncertainty principle ∆x∆p ≥ ħ/2.

construct a mathematically rigorous basis for the spectral lines. What he realized was that if x and p were not numbers, but objects that did not commute, then the differences between allowed energy levels could be discrete and explain the spectral lines. His mentor Max Born made the connection immediately that what Heisenberg had uncovered was that x and p are not numbers but matrices. This is because matrices do not necessarily commute. The first mathematically consistent formulation of quantum mechanics, due to Heisenberg, is thus called matrix mechanics. Matrices can be loosely considered to be ”scars” of numbers, meaning the sharp numbers x and p are smeared out into a set of numbers arranged in the matrices [ x ] and [ p]. In classical mechanics, xp − px = [ x, p] = 0. Quantum mechanics elevates the ”status” of x and p to those of mathematical operators (or equivalently, matrices), preventing them from commuting. This is referred to as the ”first quantization” from classical to quantum mechanics. In this scheme, the dynamical variables ( x, p) that were scalars in classical mechanics are promoted to operators, and the wavefunction ψ is a scalar. If the number of particles is not conserved, then one needs to go one step further, and elevate the status of the wavefunction ψ → ψˆ too, which is called ”second quantization”.
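Born's observation that x and p behave as non-commuting matrices can be illustrated with a small numerical sketch (ħ = 1; the grid and test function are arbitrary choices): a diagonal position matrix and a central-difference momentum matrix fail to commute, and their commutator acts as iħ on a smooth function at interior grid points.

```python
# Represent x and p as finite matrices and check [x, p] f ~ i*hbar*f.
import numpy as np

hbar = 1.0
N = 400
x = np.linspace(-10.0, 10.0, N)
h = x[1] - x[0]

X = np.diag(x).astype(complex)            # position operator as a matrix
P = np.zeros((N, N), dtype=complex)       # momentum: -i*hbar * d/dx
for i in range(1, N - 1):                 # central difference, interior rows
    P[i, i - 1] = 1j * hbar / (2 * h)
    P[i, i + 1] = -1j * hbar / (2 * h)

C = X @ P - P @ X                          # the commutator [x, p] as a matrix
f = np.exp(-x**2)                          # a smooth, localized test function
g = C @ f                                  # should approximate i*hbar*f

err = np.max(np.abs(g[1:-1] - 1j * hbar * f[1:-1]))
print(np.abs(C).max(), err)   # C is nonzero; [x,p]f ~ i*hbar*f at interior points
```

The discretization error shrinks as the grid spacing h decreases, reproducing the continuum identity [x, p̂] = iħ; the point of the sketch is simply that matrices, unlike numbers, need not commute.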


Fig. 3.18 Quantum mechanics of the particle on a ring as an illustrative example of the effects of quantum confinement.

3.6 States of definite momentum and location

The wavefunction ψ_p(x) = A e^{ipx/ħ} is a state of definite momentum, since it is an eigenstate of the momentum operator: p̂ψ_p(x) = pψ_p(x). As a general rule, when an operator Ô of a physically observable quantity O acts on a quantum state ψ_O that is in a state of definite O, it extracts the number O from the wavefunction: Ôψ_O = Oψ_O. ψ_O is then called an eigenfunction of the operator Ô, and the number O is the eigenvalue. In the example discussed in this paragraph, O = p is the eigenvalue of Ô = p̂ = −iħ ∂/∂x, and A e^{ipx/ħ} is the corresponding eigenfunction⁶.

One may demand the location of the particle to be limited to a finite length L, as indicated in Fig. 3.18. This may be achieved by putting an electron on a ring of circumference L. A trial wavefunction for this particle is ψ(x) = A e^{ikx}, which upon normalization ∫₀^L |ψ(x)|² dx = 1 yields A = 1/√L. The wavefunction must also satisfy the relation ψ_p(x + L) = ψ_p(x) to be single-valued. This leads to e^{ikL} = 1 = e^{i2πn}, and k_n = n × (2π/L), where n = 0, ±1, ±2, .... The linear momentum of the particle is then quantized, allowing only discrete values. Since L = 2πR, where R is the radius of the ring, k_n L = 2πn → pR = nħ, showing that the angular momentum is quantized to the values⁷ 0, ±ħ, ±2ħ, .... This indeed is the quantum of quantum mechanics! One may then index the wavefunctions of definite linear momentum by writing ψ_n(x). Expressing states of definite momentum in terms of states of definite location similarly yields

ψ_n(x) = (1/√L) e^{i k_n x}.
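The quantization on the ring can be checked with a short numerical sketch (the circumference and quantum numbers below are illustrative values, not from the text):

```python
import numpy as np

# Particle on a ring of circumference L: psi_n(x) = exp(i*k_n*x)/sqrt(L),
# with allowed wavevectors k_n = 2*pi*n/L from single-valuedness.
hbar = 1.054571817e-34        # J*s
L = 1e-9                      # ring circumference, 1 nm (illustrative)
R = L / (2 * np.pi)           # ring radius

N_grid = 4096
dx = L / N_grid
x = np.arange(N_grid) * dx    # periodic grid covering [0, L)

def psi(n):
    k_n = 2 * np.pi * n / L   # quantization from e^{i k L} = 1
    return np.exp(1j * k_n * x) / np.sqrt(L)

norm = np.sum(np.abs(psi(3))**2) * dx             # normalization: 1
overlap = np.sum(np.conj(psi(2)) * psi(3)) * dx   # orthogonality: ~0
angular_momentum = hbar * (2 * np.pi * 3 / L) * R # p_3 * R = 3*hbar
print(norm, abs(overlap), angular_momentum / hbar)
```

The sums over the periodic grid reproduce the integrals of the text: each ψ_n is normalized, distinct ψ_m and ψ_n are orthogonal, and the angular momentum of the n = 3 state is 3ħ.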


The set of wavefunctions [..., ψ₋₂(x), ψ₋₁(x), ψ₀(x), ψ₁(x), ψ₂(x), ...] = [ψ_n(x)] is special. We note that ∫₀^L dx ψ*_m(x) ψ_n(x) = δ_nm, i.e., the functions are orthogonal, and δ_nm is the Kronecker delta. Any general wavefunction representing the particle ψ(x) can be expressed as

6 A possible confusion to avoid, because of the limited symbols in the English language here: the operator Ô and the number O should not be confused with the number zero: 0.

7 The angular momentum is a vector L = r × p, analogous to classical mechanics.


Fig. 3.19 States of definite location and states of definite momentum.

8 Fig. 3.19 shows states of definite location and momentum. A general quantum state ψ(x) can always be written as a linear superposition of states of definite location, or as a linear superposition of states of definite momenta.

a linear combination of this set. This is the principle of superposition, and a basic mathematical result of Fourier's (Fig. 3.14) theory. Thus the quantum mechanical state of a particle may be represented as ψ(x) = Σ_n A_n ψ_n(x). Clearly, A_n = ∫ dx ψ*_n(x) ψ(x). Every wavefunction constructed in this fashion represents a permitted state of the particle⁸, as long as Σ_n |A_n|² = 1.

It is useful here to draw an analogy to the decomposition of a vector into specific coordinates. In Fig. 3.20, the "hybrid" state function ψ(x) is pictured as a vector |ψ⟩ in an abstract space. The definite-momentum wavefunctions ψ_n(x) are pictured as the "coordinate" vectors |n⟩ in that space of vectors. This set of vectors is called the basis. Since there is an infinite set of integers n = 0, ±1, ±2, ..., the vector space is infinite-dimensional. It is called the Hilbert space. One may then consider the coefficients A_n as the lengths of the projections of the state on the basis states. The abstract picture allows great economy of expression by writing |ψ⟩ = Σ_n A_n |n⟩. The orthogonality of the basis states is ⟨m|n⟩ = δ_mn, and thus A_n = ⟨n|ψ⟩. Then it is evident that |ψ⟩ = Σ_n ⟨n|ψ⟩|n⟩ = Σ_n |n⟩⟨n|ψ⟩, and Σ_n |n⟩⟨n| = 1.

A vector may be decomposed in various basis coordinates. For example, a vector in 3D real space may be decomposed into cartesian, spherical, or cylindrical coordinate systems. Similarly, the choice of basis states of definite momentum is not unique. The wavefunctions for states of definite location are those functions that satisfy x̂ψ_{x₀}(x) = x₀ψ_{x₀}(x), which lets us identify ψ_{x₀}(x) = δ(x − x₀). Here δ(...) is the Dirac delta function, sharply peaked at x = x₀ (Fig. 3.19). It is instructive to expand the states of definite location in the basis of the states of definite momentum. From the uncertainty relation, we expect a state of definite location to contain many momenta. The expansion yields A_n = ∫ dx ψ*_n(x) δ(x − x₀) = e^{−i k_n x₀}/√L, whereby |A_n|² = 1/L. Thus, the state of definite location x₀ is constructed of an infinite number of states of definite momentum n = 0, ±1, ±2, ..., each with equal probability 1/L.
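The equal weights |A_n|² = 1/L can be verified numerically by standing in a narrow normalized Gaussian for the Dirac delta (the values of L, x₀, and the width σ are illustrative assumptions):

```python
import numpy as np

# Expand a sharply localized state (narrow Gaussian ~ delta(x - x0)) in the
# ring momentum basis psi_n, and check |A_n|^2 ~ 1/L for every small n.
L, x0, sigma = 1.0, 0.37, 1e-3     # illustrative values
N_grid = 100000
dx = L / N_grid
x = np.arange(N_grid) * dx

g = np.exp(-(x - x0)**2 / (2 * sigma**2))
g /= np.sum(g) * dx                # normalize: integral of g dx = 1

def A(n):
    psi_n = np.exp(1j * 2 * np.pi * n * x / L) / np.sqrt(L)
    return np.sum(np.conj(psi_n) * g) * dx   # A_n = <psi_n | delta_x0>

weights = np.array([abs(A(n))**2 for n in range(-3, 4)])
print(np.round(weights * L, 4))    # each entry close to 1: |A_n|^2 = 1/L
```

The momentum weights are equal (up to the small correction from the finite width σ), exactly as the uncertainty principle demands for a localized state.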


Fig. 3.20 Vector spaces for quantum states: we can use results of linear algebra for quantum mechanics problems.

3.7 States of definite energy: The Schrödinger equation

States of definite energy ψ_E(x) are special. Unlike the states of definite momentum or definite location, we cannot write down their general wavefunction without additional information. That is because the energy of a particle depends on its potential and kinetic components. In classical mechanics, the total energy is p²/2m + V(x), i.e., it is split between kinetic and potential energy components. Once x and p are known for a classical particle, the energy is completely defined. However, since x and p cannot be simultaneously defined for a quantum-mechanical particle with arbitrary accuracy, the energy must be obtained through operations performed on the wavefunction. Schrödinger (Fig. 3.21) provided the recipe to find the states of definite energy, and the equation is thus identified with his name. The Schrödinger equation is

[−(ħ²/2m) ∂²/∂x² + V(x)] ψ_E(x) = E ψ_E(x).    (3.12)

Fig. 3.21 Erwin Schrödinger introduced the wave equation for quantum mechanics. He was awarded the Nobel Prize in 1933.

The solution of this eigenvalue equation for a potential V(x) identifies the special wavefunctions ψ_E(x). These wavefunctions represent states of definite energy. How do we ascertain the accuracy of the Schrödinger equation? The answer: through experiments. As discussed earlier, Niels Bohr had used a heuristic model to explain the spectral lines that lacked mathematical rigor. The triumph of the Schrödinger equation was in explaining the precise mathematical structure of the electron states, and the resulting spectral lines. An electron orbiting a proton in a hydrogen atom sees the potential V(r) = −q²/(4πε₀r). Schrödinger solved this equation (with help from a mathematician), and obtained the energy eigenvalues E_n = −13.6 eV/n². Thus Bohr's semi-qualitative model was given a rigorous mathematical basis by Schrödinger's equation. The equation also laid down the recipe for solving similar problems in most other situations we encounter.

Just as for states of definite momentum or definite location, one may expand any state of a quantum particle in terms of the states of definite energy: ψ(x) = Σ_E A_E ψ_E(x), or equivalently |ψ⟩ = Σ_E A_E |E⟩. So why do states of definite energy occupy a special position in applied quantum mechanics? Because they also mathematically explain why the electron in the allowed energy states of any atom is stable, that is, it does not change in time. That becomes clear if we consider the time-dependent Schrödinger equation.
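As a numerical illustration of Equation 3.12 (for the simpler infinite square well rather than the hydrogen atom, in natural units ħ = m = 1; the grid size is an illustrative choice), a finite-difference sketch recovers the known eigenvalues E_n = n²π²ħ²/2mL²:

```python
import numpy as np

# Finite-difference sketch of the eigenvalue problem of Equation 3.12 for a
# particle in an infinite square well (V = 0 inside, walls at x = 0 and 1).
N = 500
dx = 1.0 / (N + 1)
# -(hbar^2/2m) d^2/dx^2 discretized on interior points (psi = 0 at walls)
H = (2 * np.diag(np.full(N, 1.0))
     - np.diag(np.full(N - 1, 1.0), 1)
     - np.diag(np.full(N - 1, 1.0), -1)) / (2 * dx**2)

E_num = np.linalg.eigvalsh(H)[:3]               # lowest three eigenvalues
E_exact = np.array([1, 4, 9]) * np.pi**2 / 2    # E_n = n^2 pi^2 / 2
print(np.round(E_num, 4), np.round(E_exact, 4))
```

The same diagonalization recipe carries over to any potential V(x) by adding V on the diagonal, which is the workhorse approach for the heterostructure problems of later chapters.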

3.8 Time-dependent Schrödinger equation

Newton's law F = dp/dt provides the prescription for determining the future (x′, p′) of a particle given its present (x, p). Schrödinger provided the quantum-mechanical equivalent (Fig. 3.22), through the time-dependent equation

iħ ∂Ψ(x, t)/∂t = [−(ħ²/2m) ∂²/∂x² + V(x)] Ψ(x, t).

Fig. 3.22 The dynamics of quantum states is governed by the time-dependent Schrödinger equation. Note that it looks like a hybrid of the classical energy and a wave equation, which is how it must be to account for the wave-particle duality.

To track the time evolution of quantum states, one must solve this equation and obtain the composite space-time wavefunction Ψ(x, t). Then physical observables can be obtained by operating upon the wavefunction with the suitable operators. Let's look at a particular set of solution wavefunctions that allow the separation of the time and space variables, of the form Ψ(x, t) = χ(t)ψ(x). Inserting this back into the time-dependent Schrödinger equation and rearranging, we obtain⁹

iħ χ̇(t)/χ(t) = Ĥψ(x)/ψ(x) = E.    (3.14)

9 Here χ̇(t) = dχ(t)/dt.

Now since iħ χ̇(t)/χ(t) does not depend on space, and Ĥψ(x)/ψ(x) does not depend on time, yet both are equal at all space and time, both must be equal to a constant. The constant is called E, and clearly has dimensions of energy, in joules. The right half of the equation lets us identify that Ĥψ_E(x) = Eψ_E(x) are states of definite energy. Then the left side dictates that the time dependence of these states is described by χ(t) = χ(0) e^{−iEt/ħ}. Thus the particular set of solutions

Ψ_E(x, t) = ψ_E(x) e^{−iEt/ħ}

now defines the time evolution of the states. Here ψ_E(x) are states of definite energy, as obtained by solving Equation 3.12, the time-independent Schrödinger equation¹⁰.

3.9 Stationary states and time evolution

We note that |Ψ_E(x, t)|² = |ψ_E(x)|², that is, the probability of finding the particle at any x does not change with time. That means that a particle prepared in a state of definite energy will stay in that state if there are no perturbations to V(x). Its wavefunction does evolve in time as exp(−iEt/ħ), but this evolution is "unitary" since its absolute value is unity. Notice the analogy with Newton's first law, which states that a particle at rest or moving with constant velocity will continue to do so unless acted upon by a force. The states of definite energy are therefore special since they do not evolve with time unless perturbed, and are called stationary states. They finally explain why the quantum mechanical states of electrons in atoms are stable. They are not really moving or orbiting in the classical sense; they are just there, spread in space as one diffuse object, with probability density |Ψ_E(x, t)|² = |ψ_E(x)|².

10 An aspect of slowly varying, or adiabatic, time evolution was missed by the original founders of quantum mechanics and only uncovered in the 1980s. It is embodied in the Berry phase, which is discussed in later chapters.

Any general allowed quantum state (not necessarily a state of definite energy) may then be written as the linear superposition

Ψ(x, t) = Σ_E A_E Ψ_E(x, t) = Σ_E A_E ψ_E(x) e^{−iEt/ħ}.



The states of definite energy form a convenient and often-used basis for the expansion of general states of a particle. That is because they are stationary states – it is simpler if the basis states are fixed. Consider the case where a hybrid state Ψ(x, t) is prepared with components in two states |E₁⟩ and |E₂⟩. Then the expansion is Ψ(x, t) = A_{E₁} ψ_{E₁}(x) e^{−iE₁t/ħ} + A_{E₂} ψ_{E₂}(x) e^{−iE₂t/ħ}. The probability density of this superposition state is

|Ψ(x, t)|² = |A_{E₁} ψ_{E₁}(x) e^{−iE₁t/ħ} + A_{E₂} ψ_{E₂}(x) e^{−iE₂t/ħ}|².    (3.18)

This is the single most important property of quantum states: we must first add the amplitudes of the wavefunctions, and then square to find the probability density of superposed states. In other words, we add the amplitudes, not the intensities, of the waves. Doing so (for real amplitudes and wavefunctions) yields

|Ψ(x, t)|² = |A_{E₁}|² |ψ_{E₁}(x)|² + |A_{E₂}|² |ψ_{E₂}(x)|² + 2 A_{E₁} A_{E₂} ψ_{E₁}(x) ψ_{E₂}(x) cos[(E₁ − E₂)t/ħ],    (3.19)

which oscillates in time with frequency ω₁₂ = (E₁ − E₂)/ħ. Such two-level systems are currently being explored for making quantum bits, or qubits, for a form of analog computation called quantum computation. If we had first squared and then added the intensities of the two states, their phase information would be lost, there would be no oscillations, and we would be back in the realm of classical mechanics.

All transport and optical phenomena involve time evolution, so most of the time in semiconductor physics we will work with solutions of the time-dependent Schrödinger equation. The states of definite energy as a function of momentum E(k) that form the energy bandstructure of the solid thus provide a most convenient basis for the analysis of electronic and optical phenomena of semiconductors. The time evolution of the expectation value of an operator Ô is given by Ehrenfest's theorem

d⟨Ô⟩/dt = −(i/ħ) ⟨[Ô, Ĥ]⟩,


where the operator itself is time-independent. By using Ô = p̂ and Ĥ = p̂²/2m + V(x), Ehrenfest's theorem directly leads to Newton's second law. It forms the starting point for the density-matrix formulation of the time evolution of quantum states.
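The two-level beating of Equation 3.19 can be checked numerically; the amplitudes, wavefunction values, and energies below are illustrative (ħ = 1):

```python
import numpy as np

# Probability density of a two-level superposition at a fixed point x,
# with illustrative real amplitudes and energies (hbar = 1 units).
hbar = 1.0
E1, E2 = 1.0, 3.0
A1 = A2 = 1 / np.sqrt(2)
psi1, psi2 = 0.8, 0.6            # values of the two real wavefunctions at x

t = np.linspace(0.0, 4 * np.pi, 2001)
# Add the amplitudes first, then square:
P = np.abs(A1 * psi1 * np.exp(-1j * E1 * t / hbar)
           + A2 * psi2 * np.exp(-1j * E2 * t / hbar))**2
# The cross term oscillates at omega_12 = (E1 - E2)/hbar:
P_beat = ((A1 * psi1)**2 + (A2 * psi2)**2
          + 2 * A1 * A2 * psi1 * psi2 * np.cos((E1 - E2) * t / hbar))
print(np.max(np.abs(P - P_beat)))    # the two expressions agree
```

Squaring the summed amplitudes and the closed-form beat expression agree to machine precision, and the density repeats with period 2π/ω₁₂.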



3.10 Quantum current

In the physics of semiconductors and nanostructures, we will be deeply concerned with the flow of currents. A current is a measure of the flow of objects from one point in space to another. The flow of electric charges constitutes an electric current, leading to the notion of electrical conductivity. In this section we develop the recipe to understand current flow from a quantum-mechanical viewpoint. Since the physical state of a particle in quantum mechanics is represented by its wavefunction Ψ(x, t), the current must be obtained from the wavefunction. Since |Ψ(x, t)|² = Ψ*Ψ is the probability density, let's examine how it changes with time. We obtain

∂|Ψ(x, t)|²/∂t = Ψ* ∂Ψ/∂t + (∂Ψ*/∂t) Ψ,


where we use the time-dependent Schrödinger equation iħ ∂Ψ/∂t = (p̂²/2m + V)Ψ and its complex conjugate −iħ ∂Ψ*/∂t = (p̂²/2m + V)Ψ* to obtain, for a time-independent potential V(x),

∂|Ψ(x, t)|²/∂t = Ψ* [(p̂²/2m + V)Ψ]/(iħ) + Ψ [(p̂²/2m + V)Ψ*]/(−iħ),


which simplifies to

∂|Ψ(x, t)|²/∂t = (1/2miħ) (Ψ* p̂² Ψ − Ψ p̂² Ψ*).


Since p̂ = −iħ∇_r, we recognize the resulting equation

∂|Ψ(x, t)|²/∂t = −∇_r · [(1/2m)(Ψ* p̂Ψ − Ψ p̂Ψ*)]


as the familiar continuity equation in disguise. A continuity equation is of the form ∂ρ/∂t = −∇_r · j, where ρ is the particle number density and j is the particle current density. This is illustrated in Fig. 3.23. We read off the quantum-mechanical current density as

j = (1/2m)(Ψ* p̂Ψ − Ψ p̂Ψ*).


This equation provides us the required recipe for calculating the probability density flow, or current flow, directly from the quantum mechanical wavefunctions of states. We make a few observations. If Ψ is real, j = 0. Since Ψ has dimensions of 1/√Volume, the dimension of j is per unit area per second. For 3D, volume is in m³ and j is then in 1/(m²·s). For 2D, j is in 1/(m·s), and it is simply 1/s for 1D. We will use this concept of currents in greater detail in later chapters, and generalize it to charge, heat, or spin currents. We also note that

d/dt (∫ d³r |Ψ|²) = −∫ d³r ∇ · j = −∮ j · dS = 0.
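As a check of the current expression j = (1/2m)(Ψ* p̂Ψ − Ψ p̂Ψ*), a plane wave Ψ = A e^{ikx} should carry j = ħk|A|²/m. A finite-difference sketch (ħ = m = 1; the values of A and k are illustrative):

```python
import numpy as np

# Evaluate j for a plane wave Psi = A*exp(i*k*x) with finite differences.
k, A = 2.5, 0.3
x = np.linspace(0.0, 10.0, 100001)
dx = x[1] - x[0]
Psi = A * np.exp(1j * k * x)

pPsi  = -1j * np.gradient(Psi, dx)              # p-hat acting on Psi
pPsic = -1j * np.gradient(np.conj(Psi), dx)     # p-hat acting on Psi*
j = (np.conj(Psi) * pPsi - Psi * pPsic) / 2.0   # real up to roundoff

mid = len(x) // 2
print(j[mid].real, k * A**2)   # both close to hbar*k*|A|^2/m = 0.225
```

The numerically evaluated current is real and uniform along x, matching the analytic value ħk|A|²/m.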



Fig. 3.23 The continuity equation ∂ρ/∂t = −∇ · j states that the rate of increase of the particle density ρ at a point is equal to the net inflow of particles carried by the flux j. The inflow is the opposite of the outflow, which is given by the divergence of the vector flux field.


The conversion of the integral from a volume to a closed surface uses Gauss' theorem. The value of the integral is zero because Ψ, and consequently j, goes to zero at infinity, and the equality must hold for all space. This equation is a statement of the indestructibility of the particle, which follows from ∫ d³r |Ψ|² = 1. If the number of particles is not conserved, then one needs to add recombination ("annihilation") and generation ("creation") terms to the continuity equation. It then reads ∂ρ/∂t = −∇ · j + (G − R), where R and G are the recombination and generation rates.

We also note that in the presence of a magnetic field B = ∇ × A, where A is the vector potential, the quantum-mechanical momentum operator p̂ → p̂ + qA, where q is the magnitude of the electron charge. This leads to an additional term in the expression for the current density:

j = (1/2m)(Ψ* p̂Ψ − Ψ p̂Ψ*) + (qA/m) Ψ*Ψ.    (3.27)

The additional term depending on the magnetic vector potential A is needed to explain current flow in magnetic materials, magnetotransport properties, and superconductivity. It is discussed in Chapter 25.

If we want to determine the electric charge current, we realize that the current flux is actually that of electrons with wavefunction Ψ, for which we have calculated the probability current flux j. The charge q is dragged along by the electron. So to account for the flow of charge, the current density is simply J = qj, where q is the charge (in coulombs) of the charged particle. If these charged particles are electrons, q = 1.6 × 10⁻¹⁹ C and the free mass is m_e = 9.1 × 10⁻³¹ kg. In the absence of a magnetic field, the electric current density is then given by

J = (q/2m_e)(Ψ* p̂Ψ − Ψ p̂Ψ*),    (3.28)

which is now in A/m² for 3D, A/m for 2D, and A for 1D current flow, where A = C/s is the unit of current, the ampere. The current density is expressed in terms of the electron wavefunctions. We wish to connect this expression to the classical Drude formulation of Chapter 2.
Consider free electrons in 1D with periodic boundary conditions between x = (0, L). The wavefunction for a state |k⟩ of definite energy E(k) is Ψ_E(x, t) = (1/√L) e^{ikx} e^{−iE(k)t/ħ}. In the quantum-mechanical expression for the current, the time-evolution factor is not affected by the momentum operator, and therefore its contribution factors to 1. It is another illustration of the virtues of working with states of definite energy. The current carried by the state |k⟩ is then obtained as J(k) = I(k) = qħk/(m_e L); the current density and the current are the same in 1D. The current I(k) = qħk/(m_e L) = qv(k)/L connects to the classical notion of a current carried by a particle with velocity v(k) = ħk/m_e traversing a distance L. Another way to picture the same current is to split it as I = q × v(k) × n, where n = 1/L is the "volume density" of particles. So we can find the current flow due to each allowed k-state for any quantum particle. Now let f(k) be an occupation function that determines whether that k-state is occupied by a particle or not, and if it is, how many particles are sitting in it. To find the occupation function f(k), we stumble upon one of the deepest mysteries of nature that was unearthed by quantum mechanics.
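A quick numerical estimate of the single-state current I(k) = qħk/(m_e L), with an illustrative ring length and quantum number (not values from the text):

```python
import numpy as np

# Current carried by one occupied k-state of a 1D ring of length L.
q    = 1.602176634e-19    # electron charge magnitude, C
hbar = 1.054571817e-34    # J*s
m_e  = 9.1093837015e-31   # free electron mass, kg
L    = 1e-6               # 1 micron ring (illustrative)
n    = 100                # quantum number of the occupied state (illustrative)
k    = 2 * np.pi * n / L  # allowed wavevector

v = hbar * k / m_e        # group velocity v(k) = hbar*k/m_e, in m/s
I = q * v / L             # I = q * v(k) * (1/L), in amperes
print(f"v = {v:.3e} m/s, I = {I:.3e} A")
```

A single occupied state with n = 100 on a micron-scale ring carries a current of order ten nanoamperes, which sets the scale for quantized conduction discussed in later chapters.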

3.11 Fermions and bosons

The wave-particle duality requires quantum states to be represented mathematically by wavefunctions ψ, and |ψ|² represents the probability density. Together, these two seemingly simple requirements have a rather astonishing physical consequence for the properties of more than one quantum particle. Consider two quantum states |a⟩ and |b⟩ with real-space wavefunctions ψ_a(x) and ψ_b(x). What is the many-particle wavefunction when two particles are put in the two states? Let's label the locations of the two particles as x₁ and x₂. If the two particles are distinguishable, such as an electron and a proton, then the composite wavefunction may be written as the product of the single-particle wavefunctions (see Fig. 3.25)

ψ(x₁, x₂) = ψ_a(x₁) ψ_b(x₂).    (3.29)


But if the two particles are indistinguishable, such as two electrons, the wavefunction must satisfy further requirements¹¹. Specifically, if we swap the locations of the two electrons, x₁ ↔ x₂, the physical observables of the composite state must remain the same. This requirement dictates that the probability density must satisfy

P(x₂, x₁) = P(x₁, x₂) → |ψ(x₂, x₁)|² = |ψ(x₁, x₂)|².


The product wavefunction of Equation 3.29 does not satisfy this requirement, because in general |ψ_a(x₁)|² |ψ_b(x₂)|² ≠ |ψ_a(x₂)|² |ψ_b(x₁)|². So the simple product of the wavefunctions cannot represent indistinguishable particles. A symmetrized form, however, does the job:

ψ(x₁, x₂) = ψ_a(x₁)ψ_b(x₂) + ψ_a(x₂)ψ_b(x₁),    (3.32)


because for this composite wavefunction,

ψ(x₂, x₁) = +ψ(x₁, x₂)


and the probability density does not change upon swapping. We also note that both particles may be at the same x, since

ψ(x₁, x₁) = +ψ(x₁, x₁)


is OK. Particles in nature that choose the "+" sign in Equation 3.32 are called bosons. Multiple bosons can occupy the same quantum state. What is very interesting is that this is not the only choice! The antisymmetrized form

ψ(x₁, x₂) = ψ_a(x₁)ψ_b(x₂) − ψ_a(x₂)ψ_b(x₁)

Fig. 3.24 Wolfgang Pauli discovered the exclusion principle for which he won the Nobel Prize in 1945. He introduced matrices for electron spin. He humorously referred to himself as the imaginary part of another notable physicist – Wolfgang Paul.


11 Because the total energy of two particles is the sum of the two, and the eigenfunctions evolve with time as Ψ_E(x, t) = ψ_E(x) e^{−iEt/ħ}, the combined wavefunction of a multi-particle system must be some form of a product of the single-particle eigenfunctions so that the energies add in the exponent.


Fig. 3.25 Indistinguishable particles suffer an identity crisis when we try constructing a wavefunction for more than one particle!

leads to

ψ(x₂, x₁) = −ψ(x₁, x₂),

which is also permitted, since the probability density remains unaltered by the negative sign¹² upon swapping the particles. Particles that choose the "−" sign are called fermions. However, an attempt to put both fermions in the same location leads to

ψ(x₁, x₁) = −ψ(x₁, x₁) =⇒ ψ(x₁, x₁) = 0.

12 Most practitioners of quantum mechanics will tell you how much they love this negative sign!
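The swap symmetry of the two combinations, and the vanishing of the fermionic amplitude at x₁ = x₂, can be checked directly with two illustrative single-particle functions (Gaussians chosen for this sketch, not from the text):

```python
import numpy as np

# Two illustrative single-particle orbitals (unnormalized is fine here,
# since we only test symmetry properties, not probabilities).
def psi_a(x): return np.exp(-(x - 1.0)**2)
def psi_b(x): return x * np.exp(-(x + 1.0)**2)   # a different orbital

def boson(x1, x2):   return psi_a(x1)*psi_b(x2) + psi_a(x2)*psi_b(x1)
def fermion(x1, x2): return psi_a(x1)*psi_b(x2) - psi_a(x2)*psi_b(x1)

x1, x2 = 0.3, -0.7
# Swapping particles: bosons pick up "+", fermions pick up "-".
print(boson(x2, x1) == boson(x1, x2), fermion(x2, x1) == -fermion(x1, x2))
# Pauli exclusion: the fermion amplitude vanishes when x1 = x2.
print(fermion(0.5, 0.5))
```

The antisymmetric combination is identically zero whenever the two coordinates coincide, which is the exclusion principle in miniature.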

This is the famous Pauli exclusion principle. It states the simple result that two identical fermions (e.g. electrons) cannot be in the same quantum state. The exclusion principle is responsible for the existence of the periodic table of elements, and all chemical behavior of matter. Fig. 3.26 shows how the two core concepts of quantum mechanics, the Schrödinger equation and the exclusion principle, together explain the existence of the elements and the periodic table shown in Fig. 3.27. The structure of an atom consists of the nucleus with protons and neutrons. Electrons orbit the nucleus experiencing a net potential V(r). The Schrödinger equation provides the recipe for finding the allowed states of definite energy for the electrons. The allowed set of definite-energy states in which the electron is bound to the nucleus are referred to as orbitals, and are discrete, with corresponding discrete energy eigenvalues. For example, for a fluorine atom, the allowed electron orbitals are ψ_1s(r), ψ_2s(r), ψ_2p(r), ... with the corresponding energy eigenvalues E_1s, E_2s, E_2p, .... Note that the p orbital allows three eigenvalues of equal energy. The orbitals 2s and 2p are said to be in the same shell, which is different from the shells of 1s, and of 3s, 3p, and 3d. Each orbital allows electrons of two opposite spins.

Because the fluorine atom has 9 electrons, the electrons must start filling from the lowest orbital and follow the exclusion principle.

Fig. 3.26 The existence of the elements in the periodic table is explained by the exclusion principle and Schrödinger's equation. The allowed electron states or orbitals, and their energies, are determined by the net potential experienced by the electron due to the combined influence of the nucleus and the other electrons, found by solving the Schrödinger equation. Filling these allowed orbitals with electrons according to the exclusion principle leads to a periodic occurrence of partially filled orbitals that make elements chemically reactive, such as fluorine, or chemically inert, such as neon.

Following this prescription, Fig. 3.26 shows that the 1s, 2s, and two of the 2p orbitals are completely filled, but one of the 2p orbitals has only one electron and can accept another. The next element in the periodic table is neon, which has 10 electrons. This exactly fills the 2p orbitals. This may seem like a minor detail, but the physical properties of fluorine and neon differ drastically due to this difference! If all the orbitals of a shell are filled with electrons, we call it a closed shell. Atoms with 8 electrons in the outermost shell are inert gases. For neon, shells 1 and 2 are closed, and the atom is chemically closed too, meaning this atom will not accept or give away electrons easily: there is a very high energy cost to do so. This makes neon chemically inert, and such elements are therefore called inert or noble gases. On the other hand, if fluorine could add just one more electron to its arsenal, it could become inert. This makes a fluorine atom extremely hungry for electrons – it will yank electrons away from any element that has one extra electron in a shell. Such atoms are H, Li, Na, ..., the atoms of group I of the periodic table. This is the reason for chemical bonding and the formation of molecules – if a hydrogen and a fluorine atom share one electron, both of them can effectively close their open shells. The hydrogen looks like helium, and the fluorine like neon, in this molecule. The total energy of the hydrogen fluoride (HF) molecule is lower than the sum of the individual energies of the H and F atoms. This chemical reactivity of the elements of group VII, fueled by the hunger for that extra electron, makes them bond with group I metals, producing salts such as NaCl. This is the reason group VII gases are called halogens, which literally means "salt-producing". Pauli arrived at the exclusion principle in an effort to explain the sharp atomic spectra of helium and many-electron atoms.
The Bohr model, which was successful in explaining the hydrogen atom spectra, failed for atoms with two or more electrons. Enforcing the exclusion principle explained not just the spectra, but a whole host of other properties of matter!

Fig. 3.27 The periodic table of elements: all the (known) cards of the universe in one neat table!

Fig. 3.28 Dmitri Mendeleev, the discoverer of the periodic table of elements. Based on the periodic occurrence of physical properties, Mendeleev predicted the existence of the then-unknown elements Ga and Ge, which he called eka-Aluminum and eka-Silicon, possibly as a throwback to the ancient language of Sanskrit (eka = one), which uses periodically repeating speech sounds.

Many elements can continue lowering their energies by forming larger and larger molecules. A large molecule where the atoms are arranged in a perfectly periodic fashion is called a crystal. The elements on the left side of the periodic table have a few extra electrons in their outermost shells: those shells are mostly empty, and the atoms can lower their total energy by sharing electrons with nearby atoms. So they typically form crystals at room temperature – these crystals are called metals. The shared electrons in metals can move across the crystal easily, giving metals their high electrical conductivity. On the right of the periodic table are atoms such as O, N, Cl, and F that have mostly filled shells; these need only a few electrons to lower their energies – which makes them form small molecules such as O₂ and N₂, and not condense into a crystal, but exist as gases at room temperature. Near the middle of the periodic table, in group IV, the atoms C, Si, and Ge tend to structurally form crystals like metals, but because they have exactly half-filled electronic shells, they are electrically not conductive, as the electrons are all used up in forming the chemical bonds, with none left over for conduction. They are called semiconductor crystals, and are the class of materials that this book is primarily concerned with.

3.12 Fermion and boson statistics

Atoms and small molecules have a small number of electrons. But in very large molecules, or crystals as discussed in the last paragraph, there are a large number of energy orbitals available for electrons to occupy. Their distribution must follow a statistical process that minimizes the energy cost. For example, consider 10²² electrons in a solid, and 10²³ orbitals – how do the electrons choose which orbitals to occupy?

In the presence of a large number of electrons, the Pauli exclusion principle leads to an occupation probability of quantum states that is obtained by the maximization of the thermodynamic entropy. The result was first derived by Fermi (Fig. 3.29) and Dirac for non-interacting electrons. It is derived in Chapter 4 from thermodynamic arguments coupled with the Pauli exclusion principle resulting from the wave-particle duality. The resulting distribution is called the Fermi–Dirac distribution

f_FD(E) = 1/(e^{(E−µ)/k_bT} + 1),    (3.37)

where µ is the chemical potential, which is a measure of the number of fermions, k_b the Boltzmann constant, and T the absolute temperature. The chemical potential µ grows with the number of particles, and depends on the temperature. At T → 0 K, the Fermi–Dirac function assumes a unit-step shape, and the highest energy orbital that is occupied is called the Fermi energy E_F. So at T = 0 K, the chemical potential is equal to the Fermi energy: µ(T = 0 K) = E_F.

Note that the probability of a fermion occupying an orbital allowed by the Schrödinger equation cannot exceed 1, enforcing the exclusion principle. If a measurement to locate the fermion in a single orbital is made, the result will always be 0 or 1, not a fraction. But if a very large number of exact copies of the system are prepared, and the measurement to locate the fermion in a particular orbital is made on each system, the average occupation number will be given by the Fermi–Dirac distribution. In a single system, if there are a large number of degenerate energy orbitals, then the occupation probability is the time-averaged value. This is the meaning of the distribution function. The equivalent quantum-statistical result for bosons is the Bose–Einstein distribution

f_BE(E) = 1/(e^{(E−µ)/k_bT} − 1),

Fig. 3.29 Enrico Fermi made significant contributions to virtually all fields of physics. He was awarded the Nobel Prize in 1938.

Fig. 3.30 Satyendra Nath Bose, who discovered the statistics for photons. Particles that follow such statistics are called bosons.



where µ is the chemical potential. The Bose–Einstein distribution allows occupation values larger than 1. Dramatic effects such as Bose–Einstein condensation (BEC), lasers, and the existence of superconductivity occur when many bosons co-exist in the same quantum state. The bosons can be composite particles, for example Cooper pairs in superconductors, which are electron-phonon-electron quasiparticles in which electrons are "glued" together by phonons.

The Fermi–Dirac and Bose–Einstein statistics have a delightful history of development. The Bose–Einstein distribution was arrived at by analysis of the thermodynamic properties of a large number of light quanta, or photons, around 1924. Immediately after Pauli unearthed the exclusion principle, first Fermi, and then Dirac, arrived at the same distribution from separate angles. It was clear that the Bose–Einstein distribution applied to photons, and, through models of vibrations of solids, was also known to apply to the thermodynamics of matter waves. But it was not immediately clear what the Fermi–Dirac distribution was to be applied to! Letters exchanged between Pauli and Heisenberg at that time show how Pauli was slowly convinced that the Fermi–Dirac distribution was the correct one for electrons and electronic phenomena in solid crystals. The first application of the Fermi–Dirac statistics was made by Pauli to explain the phenomenon of paramagnetism of solids, which is a weak attraction nonmagnetic solids feel towards magnets. Pauli's paramagnetism effectively launched solid-state physics and the quantum theory of solids. Arnold Sommerfeld was deeply impressed by Pauli's use of the Fermi–Dirac statistics to explain paramagnetism. He set out to fix the nearly three-decade-old Drude model of metals. Sommerfeld had spectacular success in fixing the failings of the Drude model by simply changing the classical Maxwell–Boltzmann distribution e^{−E/k_bT} to the new, fully quantum Fermi–Dirac statistics of electrons. We will encounter this story in Chapter 5.

Fig. 3.31 A timeline of the development of quantum mechanics and its applications to crystalline solids.

Fig. 3.32 Indistinguishable particles can be of two types: bosons or fermions. They have very different properties!
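A short numerical sketch of the two distributions (the energies, chemical potential, and temperatures below are illustrative values):

```python
import numpy as np

# Fermi-Dirac and Bose-Einstein occupation functions.
kB = 8.617333262e-5          # Boltzmann constant, eV/K

def f_FD(E, mu, T):
    return 1.0 / (np.exp((E - mu) / (kB * T)) + 1.0)

def f_BE(E, mu, T):
    return 1.0 / (np.exp((E - mu) / (kB * T)) - 1.0)

mu = 1.0                     # chemical potential, eV (illustrative)
print(f_FD(mu, mu, 300.0))             # exactly 1/2 at E = mu
print(f_FD(0.9, mu, 50.0))             # ~1: below mu, step-like at low T
print(f_FD(1.1, mu, 50.0))             # ~0: above mu
print(f_BE(mu + 0.005, mu, 300.0))     # > 1: many bosons in one state
```

The Fermi–Dirac occupation never exceeds 1 and sharpens into the unit step µ = E_F as T → 0, while the Bose–Einstein occupation diverges as E approaches µ from above.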

3.13 The Spin-statistics theorem

In addition to linear momentum p = ℏk and angular momentum L = r × p, electrons also possess an extra bit of spin angular momentum. In semiconductors, electron spin plays an important role in the electronic band structure. The net angular momentum of electron states is obtained by adding the various components of the angular momenta.


The exclusion principle is central to the spin-statistics theorem from relativistic quantum field theory. Pauli proved the spin-statistics theorem rigorously in 1940, though its validity was well known by that time. It states that bosonic particles have integer spins, and fermionic particles have half-integer spins. That means bosons have spins S = 0, ±ℏ, ±2ℏ, ..., and fermions have spins S = ±ℏ/2, ±3ℏ/2, .... Electrons have spin¹³ ±ℏ/2. The fundamental dichotomy of particles in nature has received increasing attention in the last three decades. Quasi-particle states have been observed (for example in the fractional quantum Hall effect) that behave neither like fermions nor bosons. Swapping the single-particle states for such quasi-particles leads to the accumulation of a phase factor: ψ(x₂, x₁) = e^{iφ} ψ(x₁, x₂).

13 A more accurate statement is that when the spin angular momentum of the electron is projected along any axis, it can take the values ±ℏ/2. This applies to all the earlier statements in this paragraph.


These particles evidently satisfy the indistinguishability criteria, but accumulate a(ny) phase, leading to their name anyons. Exercise 3.10 explores some features of anyons further. Anyon states can exhibit a richer range of statistics than fermions and bosons. For anyons, commuting (or Abelian) statistics has similarity to fermions and bosons, but non-commuting (or non-Abelian) statistics does not have such an analog. Non-Abelian anyons are of current interest due to their proposed usage in topological quantum computation.

3.14 The Dirac equation and the birth of particles

Dirac (Fig. 3.33) was not comfortable with Schrödinger's equation since it was not consistent with relativity, and did not predict the spin of electrons. He was able to reformulate (see Exercise 3.9) the quantum mechanics of electrons from Schrödinger's equation

iℏ ∂|ψ⟩/∂t = [p̂²/2m + V(r, t)] |ψ⟩

to the Dirac equation

iℏ ∂|ψ⟩/∂t = [c α̂ · p̂ + βmc² + V(r, t)] |ψ⟩,

where c is the speed of light, and α̂, β are the famous 4 × 4 Dirac matrices. The total energy for the free electron when V(r, t) = 0 is given by E = √((pc)² + (mc²)²), which is consistent with special relativity. Before Dirac, the concept of a "particle" was not very clear. Dirac's assertion was to the effect: "a particle is the solution of my equation". Dirac's equation described the electron energy spectrum with more accuracy than Schrödinger's equation, and accounted for spin naturally. It also predicted the existence of negative energy states, or
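The matrix content of this statement is easy to check directly. The sketch below (not from the text) builds the standard Dirac-representation 4 × 4 matrices from the Pauli matrices and confirms that the eigenvalues of the free-particle Hamiltonian c α̂·p̂ + βmc² are ±√((pc)² + (mc²)²), each occurring twice (the two spin states), in natural units c = m = 1 with an arbitrary illustrative momentum:

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2, Z2 = np.eye(2), np.zeros((2, 2))

# Dirac representation: alpha_i = [[0, sigma_i], [sigma_i, 0]], beta = diag(I2, -I2)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in (sx, sy, sz)]
beta = np.block([[I2, Z2], [Z2, -I2]])

c, m = 1.0, 1.0                  # natural units (illustrative)
p = np.array([0.3, -0.4, 1.2])   # arbitrary illustrative momentum

H = c * sum(pi * ai for pi, ai in zip(p, alpha)) + beta * m * c**2
E = np.sqrt((np.linalg.norm(p) * c) ** 2 + (m * c**2) ** 2)
eigs = np.sort(np.linalg.eigvalsh(H))  # two negative- and two positive-energy branches
```

The two negative eigenvalues are precisely the negative-energy states discussed next.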

Fig. 3.33 Paul Dirac unified quantum theory with Einstein's special relativity and discovered an equation that showed why the electron must have spin angular momentum. His equation gave birth to most of particle physics, and topological aspects of condensed matter physics. He shared the physics Nobel prize in 1933 with Schrödinger.


Fig. 3.34 The positron, or the anti-electron, was the first anti-matter particle predicted to exist by the Dirac equation. It was discovered by Carl Anderson in cloud chamber tracks in 1934.

anti-electrons. This was the first prediction of antimatter. A few years after the prediction, such particles were discovered in cloud chambers by Carl Anderson (Fig. 3.34); these particles are called positrons. Electrons and positrons annihilate each other, emitting light of energy ℏω = 2m₀c². The concept of a hole in semiconductors, which will be covered in quite some detail in this book, has many analogies to a positron, but it is not quite the same. Positrons are used today in Positron Emission Tomography (PET) for medical imaging, and also to detect missing atoms (or vacancies) in semiconductor crystals. The philosophy of Dirac that "particles are solutions to equations" gave rise to the prediction of a number of new particles that have since been observed, such as quarks, gluons, the Higgs boson, etc. Majorana fermions fall under the category of predicted exotic particles, and there is intense interest in realizing such exotic states in matter for topological quantum computation. What was exotic yesterday will become commonplace tomorrow, so keep track of those "particles"!

3.15 Chapter summary section

In this chapter, we have learned:

• All objects with momentum p have a wavelength λ = h/p, where h is the Planck constant. This is the de Broglie relation of wave-particle duality. It holds for massless photons, for which p = ℏω/c, and also for particles with mass such as electrons, for which p = m_e v/√(1 − (v/c)²) ≈ m_e v for v ≪ c, marking the quantum to classical transition.

The distribution may be thought of as a function of the energy E, or of the chemical potential µ. We use the compact notation f_0 = f_0(E − µ) = f_FD(E). The partial derivative with respect to energy is

∂f_0/∂E = −∂f_0/∂µ = −β · e^{β(E−µ)}/(1 + e^{β(E−µ)})² = −β · f_0[1 − f_0],


which can be rearranged to the form

−∂f_0/∂E = +∂f_0/∂µ = β / (4 cosh²(β(E − µ)/2)).


Because cosh²(x) ≥ 1, the derivative of the Fermi–Dirac distribution reaches its maximum value of β/4 = 1/(4k_bT) at E = µ. The integral of this function, ∫_{−∞}^{+∞} du β/(4 cosh²[βu/2]) = 1, indicates that in the limit of very low temperatures, 1/(k_bT) = β → ∞, the derivative function should approach a Dirac-delta function in the energy argument⁹, i.e.,

lim_{T→0} [−∂f_0/∂E] = lim_{T→0} [+∂f_0/∂µ] = δ(E − µ).
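These properties are easy to confirm numerically; a plain-Python sketch with illustrative values k_bT = 0.025 eV and µ = 0.5 eV (the numbers are assumptions for the example, not values from the text):

```python
import math

kbT = 0.025          # illustrative thermal energy (eV)
beta = 1.0 / kbT
mu = 0.5             # illustrative chemical potential (eV)

def f0(E):
    return 1.0 / (1.0 + math.exp(beta * (E - mu)))

def minus_df0_dE(E):
    # -df0/dE = beta f0 (1 - f0) = beta / (4 cosh^2(beta (E - mu)/2))
    return beta * f0(E) * (1.0 - f0(E))

# peak value beta/4 at E = mu
peak = minus_df0_dE(mu)

# Riemann-sum integral over a window many kbT wide around mu: should be ~1
dE = 1e-4
area = sum(minus_df0_dE(mu + n * dE) for n in range(-20000, 20001)) * dE
```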


This feature is illustrated in Fig. 4.10. Now considering f(u) = 1/(1 + e^u) and f(v) = 1/(1 + e^v), we get the identity

f(u) − f(v) = [f(u) + f(v) − 2f(u)f(v)] × tanh((v − u)/2),

where the term in square brackets is ≥ 0.
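The identity holds for any pair of arguments; a quick numerical spot-check (a sketch):

```python
import math

def f(x):
    # Fermi function of a dimensionless argument
    return 1.0 / (1.0 + math.exp(x))

def lhs(u, v):
    return f(u) - f(v)

def rhs(u, v):
    # bracket = f(u)(1 - f(v)) + f(v)(1 - f(u)) >= 0 since 0 <= f <= 1
    bracket = f(u) + f(v) - 2.0 * f(u) * f(v)
    return bracket * math.tanh((v - u) / 2.0)

samples = [(-3.0, 2.5), (0.1, 0.2), (4.0, -4.0), (0.0, 0.0)]
max_err = max(abs(lhs(u, v) - rhs(u, v)) for u, v in samples)
```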


9 Because of the Pauli exclusion principle, the electrons in a solid that contribute to electrical conductivity are from a small window of energies where the Fermi derivative function peaks: the electronic conductivity σ is proportional to −∂f_0/∂E. This is because states of lower energy are completely occupied, and states with high energies have no electrons. This is crudely the same effect as when we blow air on a bowl of water: the surface responds, not the interior. This is derived quantitatively in later chapters.

68 Damned Lies and Statistics









Fig. 4.10 Illustration of (a) the temperature dependence of the Fermi–Dirac distribution, and (b) its derivative. While the Fermi–Dirac distribution approaches a unit-step at low temperatures, its derivative becomes sharply peaked, approaching the Dirac-delta function in the limit T → 0 K.

10 For example, in Chapter 29 the switching of sign of the Fermi difference function will be critical to the creation of population inversion in a laser.

Since for fermions f(u), f(v) ≤ 1, the term in the square brackets is always positive. So the sign of the Fermi difference function is determined by the tanh(...) term. The Fermi difference function will appear repeatedly when we study the optical and electronic transport properties of semiconductors and electronic and photonic devices¹⁰. The integral of the Fermi–Dirac function is

∫₀^∞ dE f_0(E − µ) = ∫₀^∞ dE/(1 + e^{β(E−µ)}) = (1/β) ln(1 + e^{βµ}),


which leads to the very useful Fermi difference integral

∫₀^∞ dE [f_0(E − µ₁) − f_0(E − µ₂)] = (µ₁ − µ₂) + (1/β) ln[(1 + e^{−βµ₁})/(1 + e^{−βµ₂})].   (4.22)

11 The Fermi difference function will appear later in the problem of electrical current flow. Electrons in a metal or a semiconductor are subjected to two Fermi levels when connected to a battery. The electrons in the Fermi difference function window are those responsible for current flow.

If µ₁, µ₂ ≫ k_bT, the second term on the right side is zero¹¹, and we obtain

∫ dE [f_0(E − µ₁) − f_0(E − µ₂)] ≈ µ₁ − µ₂.


That this relation is an identity is evident at T → 0, or β → ∞. Features of the Fermi difference function are illustrated in Fig. 4.11. The integral at low temperatures is just the area under the dashed difference curve, which is rectangular and has an energy width of µ₂ − µ₁. It will prove very useful later to define higher moment integrals of the Fermi–Dirac function of the form

F_j(η) = (1/Γ(j + 1)) ∫₀^∞ du u^j/(1 + e^{u−η}).
uj . 1 + eu−η

















Properties of the distribution functions 69





Fig. 4.11 Illustration of the temperature dependence of the Fermi difference distribution at (a) T = 77 K and (b) T = 300 K. The difference function peaks in an energy window µ₁ ≤ E ≤ µ₂. It becomes increasingly rectangular as the temperature drops.

The Fermi–Dirac integral is rendered dimensionless by scaling the chemical potential η = βµ and the energy u = βE by the thermal energy k_bT = 1/β. Since we are integrating over u, the Fermi–Dirac integral F_j(η) is a function of the chemical potential µ. The denominator is a normalizing Gamma function Γ(n) = ∫₀^∞ x^{n−1} e^{−x} dx with the property Γ(n + 1) = nΓ(n), which means if n is an integer, Γ(n) = (n − 1)!. A useful value of the Gamma function for a non-integer argument is Γ(1/2) = √π. For η ≫ +1, an excellent approximation to leading order is F_j(η) ≈ η^{j+1}/Γ(j + 2). Due to the high importance of Fermi–Dirac integrals in semiconductor devices, we collect the results¹²:

F_j(η) = (1/Γ(j + 1)) ∫₀^∞ du u^j/(1 + e^{u−η}),   F_j(η) ≈ e^η (η ≪ −1),   F_j(η) ≈ η^{j+1}/Γ(j + 2) (η ≫ 1).   (4.25)

From Equation 4.21, we have an exact analytical result for the Fermi–Dirac integral for j = 0: it is F_0(η) = ln(1 + e^η). The validity of the approximations in Equation 4.25 is easily verified for this special case. No exact analytical expressions for other orders (j ≠ 0) exist. The approximations in Equation 4.25 then assume increased importance for analytical evaluation of various physical quantities such as the mobile carrier densities in semiconductor bands, transport phenomena, and optical properties. The order j depends on the dimensionality of the problem. Fig. 4.12 illustrates the Fermi–Dirac integrals and their approximations for the cases of j = 0 and j = 1/2.
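These limits are easy to verify with a direct numerical evaluation of F_j(η) (a plain-Python sketch using a midpoint Riemann sum; the sample η values are illustrative):

```python
import math

def F(j, eta, umax=60.0, du=1e-3):
    # Fj(eta) = (1/Gamma(j+1)) * integral_0^inf du u^j / (1 + e^(u - eta))
    n = int(umax / du)
    s = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        s += u ** j / (1.0 + math.exp(u - eta))
    return s * du / math.gamma(j + 1)

f0_num = F(0.0, 2.0)
f0_exact = math.log(1.0 + math.exp(2.0))   # exact j = 0 result

nd = F(0.5, -6.0)                          # non-degenerate limit: ~ e^eta
dg = F(0.5, 30.0)                          # degenerate limit: ~ eta^(j+1)/Gamma(j+2)
dg_lead = 30.0 ** 1.5 / math.gamma(2.5)
```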















Fig. 4.12 Fermi–Dirac integrals and their non-degenerate (η ≪ −1) and degenerate (η ≫ 1) approximations, illustrating Equation 4.25. (a) The integral F_{1/2}(η) and its non-degenerate and degenerate approximations. (b) The integral F_j(η) for several j, which can be used for looking up the numerical values of the integrals.

Table 4.1 The Gamma function Γ(n) and the Riemann zeta function ζ(n) for a few arguments.

n       Γ(n)          ζ(n)
−2      ∞             0
−3/2    4√π/3         −0.025
−1      ∞             −1/12
−1/2    −2√π          −0.208
0       ∞             −1/2
+1/2    √π            −1.460
+1      1             ∞
+3/2    √π/2          +2.612
+2      1             π²/6
+5/2    3√π/4         +1.341
+3      2             +1.202
+7/2    15√π/8        +1.127
+4      6             π⁴/90
+9/2    105√π/16      +1.055
+5      24            +1.037

We will rarely need higher order terms of the Fermi–Dirac integral other than the approximations in Equation 4.25. For the rare cases where they are necessary, it has been shown that the Fermi–Dirac integral is given by the complete expansion (for η > 0, j > −1)

F_j(η) = Σ_{n=0}^∞ (2 t_{2n}/Γ(j + 2 − 2n)) η^{j+1−2n} + cos(πj) Σ_{n=1}^∞ ((−1)^{n−1}/n^{j+1}) e^{−n·η},   (4.26)

where t₀ = 1/2, t_n = (1 − 2^{1−n}) ζ(n), and ζ(n) is the Riemann zeta function. Some representative values of the Riemann zeta function and the Gamma function are listed in Table 4.1. We also note that for j = 0, F_0(η) = ln(1 + e^η) is an exact result. Since the Fermi–Dirac integrals follow the identity

∂F_j(η)/∂η = F_{j−1}(η),


and F_0(η) = ln(1 + e^η), the Fermi–Dirac integral is analytical for j = 0, −1, −2, .... For example, F_{−1}(η) = ∂F_0(η)/∂η = e^η/(1 + e^η), and so on. In the degenerate case, η > 0, the second summation term on the right in Equation 4.26 can be neglected because it drops as powers of e^{−η}, and

4.7 Quantum twist on thermodynamics 71

the first sum can be approximated (for η ≫ 0, j > −1) by

F_j(η) ≈ η^{j+1}/Γ(j + 2) + (π²/(6Γ(j))) η^{j−1} + ....


Table 4.2 Approximations for the Fermi–Dirac integral in the degenerate limit η ≫ 0 for a few orders. The second, and subsequent, terms of the expansions are much smaller than the first when η ≫ 0.

j       F_j(η) for η ≫ 0
−1/2    ≈ (2/√π) η^{1/2} − (π^{3/2}/12) η^{−3/2} + ...
0       = ln(1 + e^η) ≈ η
+1/2    ≈ (4/(3√π)) η^{3/2} + (π^{3/2}/6) η^{−1/2} + ...
+1      ≈ η²/2 + π²/6 + ...
+3/2    ≈ (8/(15√π)) η^{5/2} + (π^{3/2}/3) η^{1/2} + ...
+2      ≈ η³/6 + (π²/6) η + ...
+5/2    ≈ (16/(105√π)) η^{7/2} + (2π^{3/2}/9) η^{3/2} + ...
+3      ≈ η⁴/24 + (π²/12) η² + ...

Table 4.2 lists the degenerate approximations of the Fermi–Dirac integral up to the second term for a few orders. Note that Γ(0) is not defined, but it is not needed, since F_0(η) = ln(1 + e^η) is exact.
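The two-term degenerate expansion can be spot-checked numerically, e.g. for j = 1, where it reads F_1(η) ≈ η²/2 + π²/6 (a sketch with an illustrative η):

```python
import math

def F1(eta, umax=80.0, du=1e-3):
    # F1(eta) = integral_0^inf du u / (1 + e^(u - eta)); Gamma(2) = 1
    n = int(umax / du)
    s = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        s += u / (1.0 + math.exp(u - eta))
    return s * du

eta = 15.0                                        # illustrative degenerate value
numeric = F1(eta)
two_term = eta ** 2 / 2.0 + math.pi ** 2 / 6.0    # eta^2/2 + pi^2/6
one_term = eta ** 2 / 2.0                         # leading term only
```

The second term visibly improves on the leading term, as the assertions below confirm.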

4.7 Quantum twist on thermodynamics

A hallmark result of classical mechanics and classical thermodynamics is the principle of the equipartition of energy. This result asserts that if a classical system allows several modes (or degrees of freedom), the average energy allotted to each mode is exactly (1/2)k_bT. The result is evident if we consider a large number of particles of the same mass m that are free to move in one dimension (= 1 degree of freedom)¹³. The kinetic energy of a particle moving with velocity v_x is then E = (1/2)mv_x². The Maxwell–Boltzmann distribution function asserts that for classical particles in equilibrium with a reservoir at temperature T, to minimize the system energy, the probability of finding a particle with a kinetic energy E = (1/2)mv_x² is f(E) = exp[−E/(k_bT)]. This is a Gaussian distribution in velocity that attaches the highest probability to the slowest moving particles. The average energy is

⟨E⟩_1d = ∫_{−∞}^{+∞} dv_x · ((1/2)mv_x²) · e^{−mv_x²/(2k_bT)} / ∫_{−∞}^{+∞} dv_x · e^{−mv_x²/(2k_bT)} = (1/2)k_bT.

13 A note about the mass term m: for example for electrons in vacuum m is the free electron mass m_e, but for electrons in solids it is the effective mass m* as discussed in future chapters. The results of this section are completely general, and are not limited to electrons. For example m can be the mass of N₂ molecules for nitrogen gas in a box.

If the particles are free to move in three dimensions, they can have velocities v = (v_x, v_y, v_z), leading to an average energy

⟨E⟩_3d = ∫dv_x ∫dv_y ∫dv_z · ((1/2)m(v_x² + v_y² + v_z²)) · e^{−m(v_x²+v_y²+v_z²)/(2k_bT)} / ∫dv_x ∫dv_y ∫dv_z · e^{−m(v_x²+v_y²+v_z²)/(2k_bT)}

⟹ ⟨E⟩_3d = (1/2)k_bT + (1/2)k_bT + (1/2)k_bT = (3/2)k_bT.   (4.30)

Note that each degree of freedom gets k_bT/2 energy. This is the equipartition of energy of classical thermodynamics. Now we must fix this result of the equipartition of energy to account for the quantum statistics that are obeyed by fermions and bosons. The procedure is as simple as replacing the Maxwell–Boltzmann distribution with the correct statistics for quantum particles. For non-interacting fermions, we get for 1D the quantum result



⟨E⟩_1d = ∫_{−∞}^{+∞} dv_x · ((1/2)mv_x²) / (e^{((1/2)mv_x² − µ)/k_bT} + 1) / ∫_{−∞}^{+∞} dv_x / (e^{((1/2)mv_x² − µ)/k_bT} + 1) = (1/2)k_bT · F_{1/2}(η)/F_{−1/2}(η).

To evaluate the integrals, we have used a change of variables E = (1/2)mv_x², and the definitions of the Fermi–Dirac integrals defined in Equation 4.25, and η = µ/(k_bT). The Gamma functions give the factor of 1/2 in front because Γ(3/2)/Γ(1/2) = 1/2.

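The quantum replacement of equipartition can be verified by computing the velocity integrals directly and comparing with (1/2)k_bT·F_{1/2}(η)/F_{−1/2}(η); a dimensionless sketch with m = k_bT = 1 and an illustrative η:

```python
import math

eta = 3.0                       # illustrative degeneracy, eta = mu/(kbT)

def fd(x):
    return 1.0 / (1.0 + math.exp(x))

# velocity-space averages with m = kbT = 1 (dimensionless sketch)
dv, vmax = 1e-3, 20.0
num = den = 0.0
for k in range(int(2 * vmax / dv)):
    v = -vmax + (k + 0.5) * dv
    w = fd(v * v / 2.0 - eta)   # Fermi-Dirac occupation of E = v^2/2
    num += (v * v / 2.0) * w * dv
    den += w * dv
E_avg = num / den

def F(j, umax=60.0, du=1e-3):
    s = 0.0
    for k in range(int(umax / du)):
        u = (k + 0.5) * du
        s += u ** j * fd(u - eta)
    return s * du / math.gamma(j + 1)

ratio_form = 0.5 * F(0.5) / F(-0.5)   # the closed form above (kbT = 1)
```

For degenerate η the average energy exceeds the classical k_bT/2, since the Pauli principle forces electrons into high-velocity states.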
µ(T > 0 K) ≠ E_F(T = 0 K). In semiconductors, the number of fermions may split between different bands as electrons and holes. In such cases, it is convenient to define a Fermi level at finite temperatures at thermal equilibrium, which is different from µ = E_F(T = 0 K). Furthermore, to mimic systems out of thermal equilibrium, such as a semiconductor device through which a current is flowing, the Fermi level will split into quasi-Fermi levels, each controlled by a separate contact or source of fermions. Be aware of these variants of the concept of chemical potential of fermions, as they will appear in Chapter 5 and beyond.
(c) For bosons at T = 0 K, argue why the concept of a chemical potential is not as straightforward as that for fermions. The most common bosons that we encounter in semiconductor physics are photons and phonons. Their number is not conserved: they may be created or destroyed. Argue why for them, µ = 0. Therefore, for photons and phonons of energy ℏω, the Bose–Einstein distribution simplifies to f_BE(ℏω) = 1/(e^{ℏω/k_bT} − 1).

(4.5) The Boltzmann limit of switching
Electrons follow the Fermi–Dirac distribution in allowed energy bands of semiconductor-based diodes and transistors that are used for digital logic circuits. It is experimentally observed that the current carried by these electrons under certain conditions depends on a control voltage V and temperature T as I ∝ e^{qV/k_bT}. This is because the current is proportional to the number of electrons at energies much larger than the Fermi level E_F.
(a) Make a connection between the number of electrons available to carry current and the Fermi–Dirac integrals shown in Fig. 4.25.

Exercises 79

(b) Show that irrespective of the orders of the Fermi–Dirac integrals, the number of electrons available to carry current changes as e^{E_F/k_bT} when the Fermi energy is much lower than the minimum allowed electron energy.
(c) Assuming that the Fermi energy E_F can be directly changed with energy qV supplied from a battery of potential V, explain the experimental observation I ∝ e^{qV/k_bT}.
(d) Show that to increase the current from I₁ to I₂, we must apply a voltage V₂ − V₁ = (k_bT/q) ln(I₂/I₁).
(e) In particular, to increase the current by a factor of 10 at room temperature T = 300 K, show that we must apply a voltage of 60 mV. This is known as the Boltzmann limit for switching. In conventional transistors and diodes, this relation limits how abruptly one can switch diodes and transistors, and sets a lower limit on the energy required to do useful operations with such devices.

(4.6) The Joyce-Dixon approximation
In this chapter we have seen how to obtain the carrier concentration n when the Fermi level E_F is known. For example, in 3D, the relation is n = N_c F_{1/2}((E_F − E_c)/k_bT), where N_c is an effective band-edge density of states. Sometimes, we also need to know the inverse relation, meaning if we know the carrier density at a certain temperature, we wish to know where the Fermi level is.
(a) Prove the Joyce and Dixon result for 3D degenerate semiconductors, when η = (E_F − E_c)/k_bT ≫ +1 and E_c is the conduction band edge energy,

(E_F − E_c)/k_bT = ln(n/N_c) + (1/√8)(n/N_c) + (3/16 − √3/9)(n/N_c)² + ....   (4.34)

(b) Find the analogs of the 3D Joyce-Dixon result for the 2D and 1D cases.

(4.7) The Dulong–Petit law
The specific heat capacity of a solid at high temperatures was known to follow the empirical Dulong–Petit law, for which classical thermodynamics provided an explanation. The Dulong–Petit law states that the specific heat capacity of a solid is c_v = 3nk_b, where n is the atomic density, and k_b the Boltzmann constant.
(a) Show that a mass M with a spring constant K, when pulled a distance x from its equilibrium position, has a total energy that is exactly twice the average kinetic and average potential energies.
(b) Assume the solid is composed of N atoms in volume V tied to each other through springs. Because of the mass-spring system, each atom is allowed to vibrate in three directions around its mean position, and the atoms also have potential energy due to the springs. Argue why the total energy absorbed by the solid at a temperature T is U = 2 · N · (3/2)k_bT = 3Nk_bT.
(c) Now show that the specific heat capacity c_v = (1/V) ∂U/∂T = 3nk_b, i.e., it is independent of temperature. Calculate the specific heat capacity for n ∼ 10²³/cm³, which is a typical density of atoms in a solid, and compare with experimentally measured values at high temperatures.
(d) Investigate when the Dulong–Petit law applies to solids, and under what conditions it breaks down. Find the typical specific heat capacities of semiconductors such as silicon and germanium, and compare with the prediction of the Dulong–Petit law. The deviations from this relation are due to the quantization of lattice vibrations, which is the topic of the next two problems.

(4.8) Einstein's theory of heat capacity
The experimentally measured specific heat capacity of solids in the late 19th century was found to be much smaller than the values the classical Dulong–Petit result predicted. And it was not universal either: it varied from solid to solid. In 1907, Einstein solved this mystery by applying the idea of quantization for the first time to vibrations in solids. This work is considered to be the first quantum theory of solids. In this problem, we re-do Einstein's derivation and in the process also gain an intimate acquaintance with heat and phonons in solids.
(a) Instead of Dulong–Petit's assumption of the classical equipartition of energy for N atomic oscillators, Einstein considered each atom to be allowed a unique mechanical oscillation frequency ω₀.
Drawing from Planck’s prescription for the blackbody radiation (see Exercise 3.2), he postulated that the energies allowed for the mechanical vibration of the atom are not continuous, but quantized according to En = n¯hω0 , where n = 0, 1, 2, .... Show that in this case, assuming the Boltzmann statistics for

80 Exercises

occupation probability e^{−E_n/k_bT} for energy E_n, the average energy for each atom is

⟨E⟩ = ℏω₀ / (e^{ℏω₀/k_bT} − 1).   (4.35)

(b) Show that because each atom can move in 3 directions, and there are N atoms in volume V where n = N/V, the total heat energy per unit volume stored in the atomic vibrations of the solid is

u_v = (3N/V) ⟨E⟩ = 3nk_b · (ℏω₀/k_b) / (e^{ℏω₀/k_bT} − 1).   (4.36)

(c) The specific heat capacity is c_v = ∂u_v/∂T. Make a plot of this quantity vs T and compare with Fig. 4.18. In particular, show why when T ≫ ℏω₀/k_b, c_v → 3nk_b, and the Dulong–Petit law is recovered. Also show why as T → 0 K, c_v → 0 as well. This is how Einstein solved the heat-capacity mystery.
(d) There is a direct analogy between Einstein's resolution of the specific heat anomaly and Planck's resolution of the ultraviolet catastrophe. Planck's resolution involved quanta of light (photons), and Einstein's quanta of mechanical vibrations in solids are called phonons. Draw the analogies between the two theories. Walther Nernst, the person behind the third law of thermodynamics, was so impressed by Einstein's explanation of the specific heat capacity that he travelled from Berlin to Zurich just to meet him in person. Nernst's fame and scientific clout is believed to have been instrumental in giving Einstein his first scientific break and high credibility.

(4.9) Debye's theory of heat capacity
In spite of the success of the Einstein theory, the quantitative behavior of the specific heat capacity of solids followed a T³ dependence as T → 0 K, and his theory did not explain this behavior. It also underestimated the value of c_v. This experimental fact was explained by a modification to Einstein's theory in 1912 by Peter Debye. Instead of assuming all atoms vibrate at the same frequency ω₀, Debye argued that atoms should be allowed to vibrate over a range of frequencies ω. The frequencies ω are related to the wavelength of sound waves carried by the solid, ω = v_s k, where v_s is the speed of sound, and k = 2π/λ where λ is the wavelength.
(a) By fitting sound waves in a 3D solid of volume V and N atoms of atomic density

n = N/V, and counting the total number of modes, show that the maximum energy of vibration is ℏω_D = ℏv_s(6π²n)^{1/3}. The frequency ω_D = 2πν_D = v_s(6π²n)^{1/3} is called the Debye frequency of vibrations of a solid. Estimate ν_D in a few solids and show it is in the range of 10¹² Hz.
(b) By summing over the energies of the vibrational modes and using the density of states to go from a sum to an integral, show that the energy per unit volume is

u_v = p · (3n) · (k_bθ_D) · (T/θ_D)⁴ · ∫₀^{θ_D/T} dx x³/(e^x − 1),   (4.37)

where p is the number of polarization modes of the sound wave. For a 3D solid, there is one longitudinal and two transverse polarizations for each mode, so p = 3.


Fig. 4.18 The specific heat capacity c_v of solids as a function of temperature. The classical Dulong–Petit result is reached only at high temperatures. Einstein first explained why c_v → 0 as T → 0 K by invoking quantization of lattice vibrations. Debye refined it to explain the T³ variation of c_v as T → 0, which was experimentally observed but not explained in the Einstein theory. Both quantum theories approach the classical Dulong–Petit law c_v = 3nk_b at high temperatures.
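Both limits shown in Fig. 4.18 can be reproduced numerically. The sketch below evaluates c_v/(3nk_b) for the Einstein model, x²e^x/(e^x − 1)² with x = θ_E/T, and for the Debye model, 3(T/θ_D)³ ∫₀^{θ_D/T} x⁴e^x/(e^x − 1)² dx; both are the standard forms that follow from Equations 4.36 and 4.37, with temperature in units of the respective θ:

```python
import math

def cv_einstein(t):
    # c_v / (3 n kb) in the Einstein model; t = T / theta_E
    x = 1.0 / t
    return x * x * math.exp(x) / (math.exp(x) - 1.0) ** 2

def cv_debye(t, steps=20000):
    # c_v / (3 n kb) in the Debye model; t = T / theta_D
    xm = 1.0 / t
    dx = xm / steps
    s = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        s += x ** 4 * math.exp(x) / (math.exp(x) - 1.0) ** 2 * dx
    return 3.0 * t ** 3 * s

high_E, high_D = cv_einstein(50.0), cv_debye(50.0)  # T >> theta: Dulong-Petit limit
low_D = cv_debye(0.02)                              # T << theta: Debye T^3 law
t3_coeff = low_D / 0.02 ** 3                        # approaches 4 pi^4 / 5
```

At low T the Einstein form dies exponentially while the Debye form falls only as T³, which is exactly the experimental discrepancy Debye fixed.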

(c) The derivative of Equation 4.37 with temperature gives the specific heat capacity c_v = ∂u_v/∂T. Make a plot of the specific heat capacity as calculated by the Debye formula against the Einstein formula, and compare with Fig. 4.18.
(d) Show that as T → 0 K, the Debye formula gives c_v ∝ T³, which is consistent with experimental

data. Debye's vibrational modes with ω = v_s k are now identified and measured for solids as the acoustic phonon modes, and Einstein's constant ω = ω₀ modes are close to the optical phonon modes. They had no experimental means to identify and distinguish between these modes at that time. The Einstein and Debye formulations of the specific heat capacity of solids were the first applications of quantum mechanics to solids and matter. Note that Schrödinger and Heisenberg's formulation of quantum mechanics, the Bose–Einstein and Fermi–Dirac statistics, and the harmonic oscillator problem of quantum mechanics only appeared more than a decade after the quantum theories of specific heat capacity, so none of those concepts are critical to truly explain the behavior of the specific heat of solids!

(4.10) Rotational and vibrational degrees of freedom
When the heat capacity of a gas of N hydrogen molecules of density n = N/V in a 3D box of volume V is measured as a function of temperature, the heat capacity per unit volume is found to go through steps as shown in Fig. 4.19: at the lowest temperatures it is c_v = (3/2)nk_b, which increases to c_v = (5/2)nk_b above T = θ_rot and to c_v = (7/2)nk_b above T = θ_vib. This mysterious behavior was explained by quantum mechanics, and the equipartition of energy.
(a) Imagine each hydrogen molecule crudely as a point mass. Explain why, because of the translational degrees of freedom of this point mass, in a dilute concentration limit when the molecules are far apart from each other, the specific heat should be c_v = ∂u_v/∂T = (3/2)nk_b.

(b) Now imagine each hydrogen molecule as a dumbbell. Then, argue why in addition to the three translational degrees of freedom, two degrees of freedom appear due to rotation. The molecule can rotate around the line joining the atoms, or around the perpendicular bisector of this line. If the distance between the atoms does not change, this is called a rigid rotor. From quantum mechanics, it is found that the rotational energy is quantized, taking values ℏω_rot l(l + 1) = k_bθ_rot l(l + 1), where l is an integer, and θ_rot a characteristic temperature for rotation. Explain why the specific heat increases by (2/2)nk_b = nk_b, and why this happens above T = θ_rot.
(c) Finally, the atoms in the molecular dumbbell can also vibrate with respect to each other. A 1-dimensional harmonic oscillator has 2 degrees of freedom, as its state may be defined by the position and velocity. Argue then why the specific heat will further increase by an additional (2/2)nk_b = nk_b for T > θ_vib.
(d) For hydrogen, experiments show that θ_rot ∼ 85 K, and θ_vib ≈ 6332 K. The gas can sometimes dissociate before such high temperatures are reached. There are further modes of energy, such as electronic energies, that can get excited at high temperatures. Do some research on this very high temperature behavior and compare with other gases.

(4.11) Interacting fermions and bosons
The Fermi–Dirac and Bose–Einstein distributions were derived under the assumption of non-interacting fermions and bosons. In the presence of interactions, the distribution functions are modified. Write a short paragraph on what the modifications are, and what the experimental signatures of such modifications are.

(4.12) Useful integrals for magnetotransport
Here we encounter two integrals related to the Fermi–Dirac function:
(a) Show that for β = 1/(k_bT) and f_0(E) = 1/(exp[β(E − µ)] + 1) the Fermi–Dirac function,


∫_{−∞}^{+∞} dE · (−∂f_0/∂E) · e^{ikE} = (πk/β) e^{ikµ} / sinh(πk/β).
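A numerical check of this Fourier-transform identity (a sketch with illustrative β, µ, and k; the phase factor e^{ikµ} reflects the peak of −∂f_0/∂E sitting at E = µ):

```python
import math, cmath

beta, mu, k = 4.0, 0.7, 2.5     # illustrative 1/(kbT), chemical potential, and k

def minus_df0_dE(E):
    # thermal broadening function, peaked at E = mu
    return beta / (4.0 * math.cosh(beta * (E - mu) / 2.0) ** 2)

# numerical Fourier transform over a window where the integrand is non-negligible
dE = 1e-3
lhs = sum(minus_df0_dE(mu + n * dE) * cmath.exp(1j * k * (mu + n * dE)) * dE
          for n in range(-8000, 8001))

arg = math.pi * k / beta
rhs = (arg / math.sinh(arg)) * cmath.exp(1j * k * mu)
```

The real prefactor arg/sinh(arg) < 1 is the thermal damping factor that reappears in Equation 4.39.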


(b) Show that

∫_{−∞}^{+∞} dE · (−∂f_0/∂E) · cos(2πs E/(ℏω_c) − π/4) = [2π²s k_bT/(ℏω_c)] / sinh(2π²s k_bT/(ℏω_c)) · cos(2πs µ/(ℏω_c) − π/4).   (4.39)

Fig. 4.19 Specific heat variation of a diatomic gas as a function of temperature.

These integrals will play an important role in the quantum magnetotransport properties discussed in Chapter 25.

(4.13) Stirling approximation
A large part of statistical mechanics hinges on the Stirling approximation ln(N!) ≈ N ln(N) − N = N ln(N/e) for large N. Here is one of many ways to derive the result.
(a) Plot ln(N!) and N ln(N) − N for N = 1 to 10⁵ and see how good the approximation is.
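Part (a) can also be done numerically; the sketch below uses math.lgamma(N + 1) = ln(N!) to track the relative error of the approximation as N grows:

```python
import math

def stirling(n):
    return n * math.log(n) - n        # N ln(N) - N = N ln(N/e)

errs = []
for n in (10, 100, 1000, 10000):
    exact = math.lgamma(n + 1)        # ln(N!)
    errs.append(abs(stirling(n) - exact) / exact)
# the relative error shrinks steadily as N grows
```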

(b) Consider the integral ∫₀^∞ dx e^{−αx} = 1/α. Differentiate both sides N times with respect to α to prove that ∫₀^∞ dx x^N e^{−αx} = N!/α^{N+1}. Putting α = 1 gives an integral representation of N!.
(c) Show that the integrand x^N e^{−x} with α = 1 peaks at x₀ = N. Prove that as N becomes large, the peak is very strong.
(d) Prove that if a function f(x) is sharply peaked at x = x₀, then the integral I = ∫ dx e^{f(x)} may be approximated by I ≈ e^{f(x₀)}, if x = x_n = n is the integer n, and ∆x = x_{n+1} − x_n = 1.
(e) Prove the Stirling formula using the above results, by writing x^N e^{−x} = e^{N ln(x) − x}.

(4.14) Entropy and limits of solar cells
The concepts of entropy and the Fermi–Dirac and

Bose–Einstein distributions determine the maximum efficiency with which semiconductor solar cells can convert energy from light to electricity. These concepts are discussed in detail in Chapter 28. In this exercise, we appreciate these limits qualitatively. (a) Make a plot of the Planck distribution of photon number versus the photon energy at the temperature T = 5800 K of the sun’s surface. This is roughly the spectrum that reaches the outer atmosphere. (b) Suppose a semiconductor of bandgap Eg = 1.0 eV is used for making a solar cell. It does not absorb photons below its bandgap. Then calculate the fraction of the photons that can be absorbed by the solar cell. (c) Once the energy is converted from light to the energy of electrons and holes in the semiconductor, the energy distribution changes to a Fermi–Dirac function, which rapidly thermalizes with the temperature of the semiconductor atoms. A thermodynamic Carnot-type efficiency limit is then also defined based on the temperature of the sun and the semiconductor, as discussed in Chapter 28. Try to argue on heuristic grounds what this limit should be, and check the actual quantitative answer in Chapter 28.
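Parts (a) and (b) can be estimated with a short numerical sketch: the Planck photon-number spectrum is ∝ x²/(e^x − 1) with x = E/k_bT, and the absorbable fraction is the part of the spectrum above x_g = E_g/k_bT (the 5800 K and 1.0 eV inputs are the exercise's numbers; the resulting fraction is a rough estimate, not a value quoted from the text):

```python
import math

kT = 8.617e-5 * 5800.0     # kb (eV/K) times the sun's surface temperature, ~0.50 eV
Eg = 1.0                   # bandgap (eV)

def photon_density(x):
    # dimensionless Planck photon-number spectrum, x = E/(kb T)
    return x * x / (math.exp(x) - 1.0)

dx, nmax = 1e-3, 40000
total = sum(photon_density((k + 0.5) * dx) for k in range(nmax)) * dx
above = sum(photon_density((k + 0.5) * dx)
            for k in range(int(Eg / kT / dx), nmax)) * dx
fraction = above / total   # photons energetic enough to be absorbed
```

The total integral is the known constant 2ζ(3) ≈ 2.404, a useful sanity check on the numerics.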

Electrons in the Quantum World

In this chapter we subject the electron to the laws of quantum mechanics embodied by the Schrödinger equation, and quantum statistics embodied by the Pauli exclusion principle, as indicated in Fig. 5.1. We discover that by upgrading classical mechanics and thermodynamics to their quantum versions, we can explain a vast array of experimental facts for which the classical Drude model of the electron failed. The goals of this chapter are to explore:

• How does application of quantum mechanics and statistics to free electrons help reveal their wavefunctions ψ, momentum, energy, density of states, energy density, and quantum mechanical currents? • How do the above physical properties of electrons depend on the number of dimensions the electron is allowed to move in? Specifically, what are the similarities and differences of the properties for electrons moving in 1D, 2D, and 3D? • How did quantum mechanics and quantum statistics of free electrons resolve the discrepancies of the classical Drude model of the behavior of electrons in solids, and what phenomena could still not be explained? • A few exactly solved situations when electrons are not free, but form bound states due to potentials that trap them. In the process, we encounter a few exactly solved problems of quantum mechanics. The set of exactly solved problems is precious, because they form a rigorous underpinning on which rests the edifice of condensed matter physics and the physics of semiconductor nanostructures. We introduce the techniques to find the physical dynamics of single electrons – such as its momentum, energy, and the current it carries – all now in the significantly updated quantum version where the wave-nature is imposed on the electron from the beginning. Then we find that because of the Pauli exclusion principle, many-electron systems have an ”internal” energy that is both very large and very bewildering, because it has simply no counterpart in classical mechanics. We introduce the concept of the density of states, and our bewilderment turns to joy as we realize that we can not only explain the failures of the

5.1 In Schrödinger equation we trust
5.2 The free electron
5.3 Not so free: Particle on a ring
5.4 The electron steps into a higher dimension: 2D
5.5 Electrons in a 3D box
5.6 The particle in a box
5.7 The Dirac delta potential
5.8 The harmonic oscillator
5.9 The hydrogen atom
5.10 Chapter summary section
Further reading





Fig. 5.1 The major changes in the understanding of the electronic properties of solids involved quantum-facelifts of mechanics and statistics, but no changes in electromagnetism were necessary. The new quantum mechanical rules that govern the behavior of the electron are the point of discussion of this chapter.

Drude model, but also discover a powerful new bag of tricks that explains and predicts a far richer range of physical behavior of electrons in bulk materials and in nanostructures, for which classical mechanics and thermodynamics are insufficient.

5.1 In Schrödinger equation we trust

As we discussed in Chapter 3, all physically measurable information about the quantum states of the electron is buried in the state vector |ψ⟩. By projecting the state vector onto real space we get the wavefunction ψ(x) = ⟨x|ψ⟩. In this chapter, we learn how to extract useful information from ψ(x) by applying the corresponding operators of physical observables on it. To do that, we have to first solve the time-independent Schrödinger equation for an electron in various potentials V(x):

−(ħ²/2m_e) d²ψ(x)/dx² + V(x)ψ(x) = E ψ(x).


The set of solutions ⟨x|n⟩ = ψ_n(x) will then be the eigenfunctions corresponding to states of definite energy, with corresponding eigenvalues E_n. As we learnt in Chapter 3, the states of definite energy are also stationary states. They form the most convenient basis for describing the situations when the potential deviates from the ideal, i.e., if V(x) → V(x) + W(x, t). Thus, the states of definite energy form the basis to uncover what happens when we perturb the quantum system. In the next sections we will examine the behavior of electrons in a few potentials V(x) that are central to understanding the electronic properties of solids in general, and semiconductor physics in particular. For these potentials, we repeat the following procedure to extract physical information:¹

¹ The Schrödinger equation can be solved exactly in an analytical form only for a very few potentials. We will cover most of them in this chapter. For general potentials, numerical solutions may be obtained, but the computational cost may be high.

• solve the Schrödinger equation exactly to obtain the wavefunction ψ(x)¹,
• the allowed momentum p_x,
• the allowed energy eigenvalues E,
• the density of states g(E),
• the total energy U, average energy u, and energy density u_v of many electrons, and
• the quantum mechanical current J.

We begin with the simplest of potentials: when V ( x ) = 0.

5.2 The free electron

For V(x) = 0, the Schrödinger equation reads

−(ħ²/2m_e) d²ψ(x)/dx² = E ψ(x).


The equation has the most general solution of the form ψ(x) = Ae^{ikx} + Be^{−ikx}, where

k = √(2m_e E)/ħ = 2π/λ.



We emphasize that the allowed wavelengths λ can take any value. Thus, the allowed k values are continuous for V(x) = 0. The allowed energy eigenvalues, also continuous, are

E(k) = ħ²k²/2m_e.


Fig. 5.2 shows this parabolic energy dispersion of the free electron. It would not be a stretch to say that this simplest energy dispersion is also one of the most important in all of condensed matter physics. The curvature of the parabola is inversely proportional to the mass of the quantum particle. If the particle is heavy, the energies are lower: you can imagine the parabola being pulled down by the heavy mass. Later we will see that the allowed energy eigenvalues of the electron in a semiconductor crystal will develop bands and gaps because of the formation of standing electron waves. Even so, within each electron band, the energy dispersion will again assume a parabolic form at the band edges, but with different effective masses than the free electron mass because of the presence of a periodic crystal potential.

We note that the general solution in Eq. 5.3 represents a superposition of two waves: one going to the right (ψ_→(x) = Ae^{ikx}) and the other to the left (ψ_←(x) = Be^{−ikx}). Since it is a "mixed" state, clearly it is not a state of definite momentum. We verify this by operating upon the wavefunction with the momentum operator:

p̂_x ψ(x) = −iħ dψ(x)/dx = −iħ(ikAe^{ikx} − ikBe^{−ikx}) = ħk(Ae^{ikx} − Be^{−ikx}) ≠ pψ(x),

Fig. 5.2 Free electron in 1D. The energy dispersion is parabolic, and all k and all E > 0 are allowed.


but for just the right-going state we get

p̂_x ψ_→(x) = −iħ dψ_→(x)/dx = −iħ(ikAe^{ikx}) = ħk ψ_→(x) = p ψ_→(x),  (5.7)

which is a state of definite momentum ħk. For a right-going momentum eigenstate |+k⟩, whose wavefunction is ψ_→(x) = Ae^{ikx}, we find, upon using Equation 3.28, that the quantum charge current density is

J(+k) = (q/2m_e)(ψ_→* p̂_x ψ_→ − ψ_→ p̂_x ψ_→*) ⟹ J(+k) = q|A|² ħk/m_e.


Note that the units are amperes, because |A|² has units of 1/length. Similarly, for a left-going state |−k⟩ with wavefunction ψ_←(x) = Be^{−ikx}, the charge current density is

J(−k) = (q/2m_e)(ψ_←* p̂_x ψ_← − ψ_← p̂_x ψ_←*) = −q|B|² ħk/m_e.

² If you are uncomfortable with this statement, I am with you. n ∼ |A|² is true only if the particle is confined, as we will see in the next section. The completely free electron wavefunction is not normalizable!

From an analogy to the "classical" charge current density J = qnv, where n ∼ |A|² or n ∼ |B|² is the particle density², we identify that the state |+k⟩ has a velocity ħk/m_e, and the mirror-reflected state |−k⟩ has a velocity −ħk/m_e. The net current due to the right- and left-going states then is given by

J_net = J(+k) + J(−k) = q · (|A|² − |B|²) · ħk/m_e,

³ A more modern formulation of the velocity of a quantum state is v_g(k) = (1/ħ)∇_k E(k) + (q/ħ) E × B(k), where the second term appears due to an effective magnetic field in the momentum (or k-) space. Here B(k) is a Berry curvature of bands and E is the electric field. The Berry curvature of free electrons, and of most semiconductor bands, is zero. The second term causes currents perpendicular to the direction of the electric field in certain special cases.


which states that the net current going to the right is the difference of the two currents. If we now had a composite state ψ(x) = C₁e^{ik₁x} + C₂e^{ik₂x} + ... = Σ_n C_n e^{ik_n x}, the net current is easily found as J_net = q Σ_n |C_n|² ħk_n/m_e. In classical mechanics, a particle of mass m has a kinetic energy E = p²/(2m), and a velocity v = p/m = dE/dp. In quantum mechanics, the particle has a wave-like nature. By analogy we cautiously³ define the group velocity of a quantum particle as proportional to the slope of the energy dispersion curve E(k):

v_g(k) = ∇_p E(p) = (1/ħ) ∇_k E(k).
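The superposition current J_net = q Σ_n |C_n|² ħk_n/m_e and the group velocity v_g = ħk/m_e can be checked with a short numerical sketch (not from the text; the amplitudes and wavevectors below are illustrative):

```python
import numpy as np

hbar = 1.054571817e-34   # reduced Planck constant, J*s
q = 1.602176634e-19      # electron charge, C
me = 9.1093837015e-31    # free electron mass, kg

def net_current(C, k):
    """Net 1D current of a superposition sum_n C_n e^{i k_n x}:
    J_net = q * sum_n |C_n|^2 * hbar*k_n/me (each plane wave has v = hbar*k/me)."""
    C, k = np.asarray(C, dtype=complex), np.asarray(k, dtype=float)
    return q * np.sum(np.abs(C)**2 * hbar * k / me)

# Equal right- and left-going amplitudes (|A| = |B|) carry zero net current:
J_balanced = net_current([1 / np.sqrt(2), 1 / np.sqrt(2)], [1e9, -1e9])

# A purely right-going state |+k> at k = 1e9 /m carries J = q*hbar*k/me:
J_right = net_current([1.0], [1e9])

# Group velocity from the slope of the parabola E(k) = hbar^2 k^2 / (2 me):
k = 1e9
v_g = hbar * k / me   # = (1/hbar) dE/dk, in m/s
```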


This definition will assume an increased importance when the E(k) dependence is different from that of a free electron; in Chapter 9 we will see that for electrons in a periodic crystal, this result remains exact, with the meaning of k modified to that of a crystal momentum. Using the group velocity, we can write the charge current as J(k) = q v_g(k) f(k), where f(k) is the occupation probability of state |k⟩. Using this procedure, we can find the quantum charge current carried by any superposition state |ψ⟩ = Σ_k A_k |k⟩ we can cook up. The free electron wavefunction cannot be normalized, because it extends over all space from −∞ ≤ x ≤ +∞. Physical quantities that have


to do with density, such as the ”carrier density”, ”density of states” or the ”energy density” are ill defined for the completely free electron because of the infinite volume it lives in. To normalize it, we wrap the infinitely long line and join the infinities to form a circle. So we first put the electron in a circular ring to calculate these quantities.

5.3 Not so free: Particle on a ring

Wavefunctions and Momenta: Fig. 5.3 (a) shows an electron restricted to move on a circular ring of circumference L, with V(x) = 0. Though it is not exactly a 1D problem, we assign one linear coordinate x to the particle's location. We demand all solutions to the Schrödinger equation to be single-valued functions of x. Because the loop closes on itself⁴, the electron wavefunction must satisfy

ψ(x + L) = ψ(x) → e^{ik(x+L)} = e^{ikx} → e^{ikL} = 1 → kL = 2nπ


to be single-valued. This is only possible if

k_n = (2π/L) n, where n = 0, ±1, ±2, ...,

where ψ_n(x) = Ae^{ik_n x}. We at once see that for the particle on a ring, the set of allowed k_n is discrete as indicated in Fig. 5.2, and thus the allowed values of the momentum are discrete:

p_n = ħk_n = (h/2π)(2π/L) n = n (h/L),


that is, the allowed values of the electron momentum are quantized. The smallest spacing of the allowed wavevectors is precisely

Δk = k_{n+1} − k_n = 2π/L.


Because the angular momentum is L = r × p, we find that

L_n = r × p = (L/2π) ħk_n ẑ = (L/2π)(2πħ/L) n ẑ ⟹ L_n = nħ,


i.e., like the linear momentum, the angular momentum of the electron on a ring is also quantized, and can only take values equal to integer multiples of the reduced Planck constant: ..., −2ħ, −ħ, 0, +ħ, +2ħ, .... We gain a physical intuition of what ħ actually means – it is a measure of the angular momentum of a quantum particle. For example, if I tie a 1 kg mass to a 1 m string and spin it at 1 m/s, the angular momentum is L_cl = 1 J·s. So for this classical situation, I will be providing the mass n = L_cl/ħ ∼ 10³⁴ quanta of angular momentum – and I may feel like Superman in the quantum world.
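The Superman estimate is one line of arithmetic; a quick check with the numbers from the text:

```python
hbar = 1.054571817e-34   # reduced Planck constant, J*s

# Classical angular momentum of a 1 kg mass on a 1 m string moving at 1 m/s:
m, v, r = 1.0, 1.0, 1.0
L_classical = m * v * r          # = 1 J*s
n_quanta = L_classical / hbar    # number of hbar quanta, ~1e34
```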

⁴ This periodic boundary condition is referred to as the Born–von Karman boundary condition. It is mathematically distinct from the "hard-wall" boundary condition that we impose for the particle in a box problem in Section 5.6, but the physics of the interior will not be affected by this choice. Exercise 5.4 makes this point clear. Furthermore, the circular "ring" geometry is an artificial construct for 1D confined electrons in a straight quantum wire. It serves to capture the physics with the least mathematical complications. It is also not artificial at all if a 1D wire is indeed wound into a ring.


But what this example really tells us is precisely how small a quantum of angular momentum actually is! As promised, unlike the free electron case, the eigenfunctions of the particle on a ring can be normalized:

∫₀ᴸ dx |ψ_n(x)|² = 1 → |A|² × L = 1 → A = 1/√L → ψ_n(x) = (1/√L) e^{ik_n x}.  (5.17)

Note that n = 0 is allowed as a result of the periodic boundary condition, and the probability density of this state |ψ0 ( x )|2 = 1/L is a constant in x, as is the probability of every state of definite momentum. We observe that the set of functions [..., ψn−1 ( x ), ψn ( x ), ψn+1 ( x ), ...] are mutually orthogonal because

⟨m|n⟩ = ∫₀ᴸ dx ⟨m|x⟩⟨x|n⟩ = ∫₀ᴸ dx ψ_m*(x) ψ_n(x) = (1/L) ∫₀ᴸ dx e^{i(2π/L)(n−m)x} = δ_{n,m}.

Fig. 5.3 Putting the electron on a ring quantizes the allowed wavevectors k_n, and as a result the momentum, the angular momentum, and the energy of the particle are quantized. The density of states for the electron on a ring with parabolic energy dispersion goes as 1/√E, counting how the allowed eigenvalues distribute in energy if we were to put many electrons on the ring.

⁵ The Kronecker delta function is different from the Dirac delta function δ(x), which takes a continuous set of arguments x, and can assume any value between 0 and ∞.
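Numerically, this orthonormality integral is easy to verify; a short sketch (not from the text – a uniform Riemann sum over one full period is exact here because the integrand is periodic):

```python
import numpy as np

# Check <m|n> = delta_{m,n} for ring eigenfunctions psi_n(x) = exp(i*2*pi*n*x/L)/sqrt(L).
L = 1.0
N = 4096
dx = L / N
x = np.arange(N) * dx   # sample one full period, endpoint excluded (periodic)

def psi(n):
    return np.exp(1j * 2 * np.pi * n * x / L) / np.sqrt(L)

def braket(m, n):
    return np.sum(np.conj(psi(m)) * psi(n)) * dx

same = braket(3, 3)   # n = m: integrand is 1/L, integral is 1
diff = braket(3, 5)   # n != m: whole periods of cos + i sin integrate to 0
```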

Note this integration carefully, because it is a very important one, and a simple one, yet it remains an endless source of confusion to beginners. If n ≠ m, the integral is ∫₀ᴸ dx · e^{i2πN(x/L)}, where N = n − m is an integer. The value of the integral is zero because we are integrating cos(2πN x/L) + i sin(2πN x/L) over an integer number of periods as x: 0 → L, with equal positive and negative parts. But when n = m, e^{i(2π/L)(n−m)x} = e⁰ = 1, and the integral is simply (1/L)∫₀ᴸ dx = 1. Because the integral is 1 when n = m and 0 when n ≠ m, we write it compactly using the Kronecker delta symbol δ_{n,m}. Note that the Kronecker delta function⁵ takes two discrete arguments n, m, and can only assume two values: 0 and 1. Couldn't be any simpler – fear not the Kronecker delta δ_{n,m} – it is your friend!

We also make a note that the set of linearly independent functions ψ_n(x) is complete, meaning if you give me any smooth function f(x) in 1D, I can write the function as f(x) = Σ_n c_n ψ_n(x) with suitable coefficients c_n. This is what we mean by linear superposition: the amazing thing about quantum mechanics is that the electron on the ring is allowed to be in any state that is a linear superposition of the eigenstates.

Energy and Density of States: The allowed energy eigenvalues are

E(k_n) = ħ²k_n²/2m_e = (2πħ)² n²/(2m_e L²) = h² n²/(2m_e L²) = E_n.


The energy eigenvalues are also quantized, and grow as n2 as seen in Fig. 5.3 (b). Because the electron is allowed to be in the n = 0 state, the minimum energy allowed is E = 0. This will not be the case if we put the particle in a box in Section 5.6. Two important physical intuitions we should take away from this example are:


• The smaller the circle, the larger the allowed energies (L ↓ ⟹ E_n ↑), and
• The smaller the mass, the larger the allowed energies (m ↓ ⟹ E_n ↑).
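Both scalings follow directly from E_n = h²n²/(2m_e L²); a short numerical illustration (the 10 nm and 5 nm ring sizes are arbitrary choices, not from the text):

```python
h = 6.62607015e-34    # Planck constant, J*s
me = 9.1093837015e-31 # free electron mass, kg
q = 1.602176634e-19   # electron charge, C

def E_ring(n, L, m=me):
    """Ring energy eigenvalues E_n = h^2 n^2 / (2 m L^2), in joules."""
    return h**2 * n**2 / (2 * m * L**2)

# Halving the circumference quadruples every level (L down => E_n up):
ratio_L = E_ring(1, 5e-9) / E_ring(1, 10e-9)                # -> 4
# Halving the mass doubles every level (m down => E_n up):
ratio_m = E_ring(1, 10e-9, m=0.5 * me) / E_ring(1, 10e-9)   # -> 2
E1_meV = E_ring(1, 10e-9) / q * 1e3                         # n = 1 level on a 10 nm ring, in meV
```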

The first statement is one of quantum confinement: the smaller the space we fit a quantum particle into, the larger will be its energy. This is because the wavelength must become very small to fit in the space, which means high k = 2π/λ, and high energy E = (ħk)²/(2m_e). A glance at the density of energy eigenvalues along the energy (y-) axis in Fig. 5.3 shows that they are more densely spaced at low energies, and become sparse at higher energies⁶. We can guess that the dependence on energy must go as 1/E^γ, where γ > 0. To find the 1D density of states quantitatively, we note that between k → k + dk, there are dk/(2π/L) allowed states. For completeness, for each allowed k-state

⁶ The allowed energy eigenvalues are equally spaced along the x- or k-axis, since Δk = 2π/L.


we introduce a spin degeneracy g_s, which is g_s = 2 for free electrons of up and down spins, and a valley degeneracy g_v, which is the number of copies of such parabolic dispersions that may be present in the k-space⁷. The total state density G_1d(E) in energy is then

g_s g_v (2dk)/(2π/L) = G_1d(E) dE ⟹ g_1d(E) = G_1d(E)/L = (2 g_s g_v)/(2π (dE/dk)) ⟹ g_1d(E) = (g_s g_v/2π)(2m_e/ħ²)^{1/2} (1/√E).
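A sanity check of g_1d(E) (not from the text): integrate it up to an energy E and compare with a direct count of the allowed k_n = 2πn/L on a ring. The 1 µm ring and 100 meV cutoff are illustrative choices:

```python
import numpy as np

hbar = 1.054571817e-34
me = 9.1093837015e-31
q = 1.602176634e-19
gs, gv = 2, 1   # spin and valley degeneracies for free electrons

def g1d(E):
    """1D density of states per unit length: (gs*gv/2pi)*sqrt(2*me/hbar^2)/sqrt(E)."""
    return gs * gv / (2 * np.pi) * np.sqrt(2 * me / hbar**2) / np.sqrt(E)

L = 1e-6          # ring circumference, 1 um (illustrative)
E = 0.1 * q       # count states below 100 meV
kF = np.sqrt(2 * me * E) / hbar

# Direct count: states with |k_n| <= kF, i.e. |n| <= kF*L/(2*pi), including n = 0:
n_counted = gs * gv * (2 * np.floor(kF * L / (2 * np.pi)) + 1)

# Analytic integral of g1d from 0 to E: L*(gs*gv/2pi)*sqrt(2me/hbar^2)*2*sqrt(E) = gs*gv*kF*L/pi
n_integrated = gs * gv * kF * L / np.pi
```

The two counts agree to a fraction of a percent, which is the discreteness of the k-grid.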


Because in the 1D ring, the electron could be moving clockwise or counterclockwise, we use 2dk to account for the two dk’s√for +k and −k wavevectors. We note that the 1D DOS decreases as 1/ E, and has a singularity as E → 0. Now instead of a single electron, if we fill the ring with N electrons, what would be the total energy of the ensemble? If there are N electrons in the ring, their 1D density is n1d = N/L. We will first completely neglect8 the Coulomb interaction between the electrons and assume they are non-interacting. Though this sounds like heresy, let me assure you that it is actually OK9 . Because electrons are fermions, the moment we go from one electron to two, the Pauli exclusion principle kicks in. Let us first address a subtle point about the physics of many fermions head-on. An essential point of clarification now is that if there are N electrons on a ring, we have a quantum many particle state, not a single particle state. A single particle quantum state that is in a superposition of many eigenstates is represented by a sum wavefunction that is a linear superposition of the form ψ( x ) = ∑k ck ψk ( x ), where ∑k |ck |2 = 1 ensures there indeed is a single, or 1 particle. But the wavefunction of a fermionic many particle quantum state is an anstisymmetrized product of single particle wavefunctions √ of the form ψ( x1 , x2 , ..., x N ) = ∑ p (−1) p ψk1 ( x1 )ψk2 ( x2 )...ψk N ( x N )/ N, where p is a permutation of xn ’s and k m ’s, compactly written as Slater determinants. We en-

⁷ For free electrons g_v = 1. In crystals there may be several equivalent energy valleys due to crystal symmetries, when g_v ≠ 1. We will encounter them later, but here we retain g_v for generality. Also for free electrons g_s = 2, but this degeneracy is broken in a magnetic field, or in magnetic materials, as discussed in later chapters.

⁸ In crystalline solids, electrons can experience bizarre situations when they not only do not repel each other, but attract! This counterintuitive mechanism is responsible for superconductivity.

⁹ The justification came from Lev Landau, and goes under the umbrella of what is called Fermi-liquid theory. We will encounter it at a later point. Besides, the discussion here is equally applicable to neutrons – fermions that actually do not have charge.


Fig. 5.4 Enrico Fermi, a towering figure whose contributions extended into all areas of physics, both as a theorist and an experimentalist. Nobel prize recipient in 1938. Fermi and Dirac formulated the quantum statistics of particles that follow the exclusion principle – such particles are named fermions after him. He played a key role in nuclear physics, and led the team that produced an artificial chain reaction for the first time.

¹⁰ Note that the number (or density) of electrons completely defines the Fermi wavevector k_F and the Fermi energy E_F at T = 0 K. This holds for all dimensions, and for all E(k) relations.

countered this fundamental property of fermions in Chapter 3, Equation 3.34. The specific case of the 2-electron state has a wavefunction ψ(x₁, x₂) = (ψ_{k₁}(x₁)ψ_{k₂}(x₂) − ψ_{k₁}(x₂)ψ_{k₂}(x₁))/√2. The Pauli exclusion principle is baked into this antisymmetric wavefunction. Though the many-particle electron wavefunction looks like a mess, the fact that the electrons are non-interacting saves us from mathematical complications. The total momentum, energy, and current of the many-particle distribution then becomes the sum of the momenta, energies, or currents of the single-particle states ψ_k(x) it is made of. Chapter 4 also gave us the occupation function of the single-particle state k at thermal equilibrium of the non-interacting many-particle state after enforcing the Pauli exclusion principle as the simple result: The equilibrium electron occupation function of a distribution of a large number of non-interacting electrons follows the Fermi–Dirac distribution f(E_k). Armed with these powerful simplifying results for many electrons, let us look at the T = 0 K situation, when f(E) = 1 for 0 ≤ E ≤ E_F and f(E) = 0 for E > E_F. The electrons then must fill up to a Fermi wavevector k_F and Fermi level E_F as shown in Fig. 5.3, such that¹⁰

g_s g_v × (2k_F)/(2π/L) = N ⟹ k_F = (π/g_s g_v) n_1d ⟹ E_F^1d(0 K) = ħ²π²n_1d²/(2 g_s² g_v² m_e).  (5.21)

This is a remarkable result: the ground state of the electron ensemble at T = 0 K already has a large amount of energy. For example, if n_1d ∼ 10⁸/cm, typical for a metal, then for g_s = 2, g_v = 1 the electron states with the highest energy have λ_F = 2π/k_F ∼ 0.4 nm and E_F ∼ 10 eV. This is the energy picked up by an electron in a 10 Volt potential¹¹, or, if we were to provide this energy in the form of heat, k_bT = E_F would lead to T ∼ 10⁵ K! Where did all this energy come from? The root of this energy reserve is the fermionic nature of the electron, and the Pauli exclusion principle. There is simply no classical explanation of this energy; it is of a pure quantum origin. We will shortly see that the very high conductivity of metals even at the lowest temperatures is a direct result of this internal quantum energy reserve. And the electrons at the highest energy are fast: the Fermi velocity is v_F = ħk_F/m_e = hn_1d/(4m_e) ∼ 5 × 10⁷ cm/s. This is a typical Fermi velocity in metals and degenerately doped semiconductors: a Fermi velocity of v_F ∼ 10⁸ cm/s is roughly v_F ≈ c/300, that is, the highest energy electrons can be considered to be whizzing around at 1/300 times the speed of light even at T = 0 K due to the Pauli exclusion principle!

¹¹ If we pack even more fermions in small volumes, the Fermi energy is much larger. For example, because of the tight packing of fermions – protons and neutrons – in the nucleus of an atom, the Fermi energy reaches ∼ 40 MeV – yes, Mega-eV!

The Fermi energy E_F^1d(0 K) = ħ²π²n_1d²/(8m_e) in Equation 5.21 is a characteristic value at T = 0 K, also called the chemical potential (see Exercise 4.4), with the symbol µ_1d = E_F^1d(T = 0 K). The carrier density at a temperature T depends on the Fermi level E_F^1d(T):

n_1d(T) =

(g_s g_v/L) Σ_k f(k) = (g_s g_v/L) ∫_{−∞}^{+∞} dk/(2π/L) · 1/(1 + e^{(ħ²k²/2m_e − E_F^1d(T))/k_bT}).
The dimensionless variables u = (ħ²k²/2m_e)/k_bT and η = E_F^1d(T)/k_bT convert the carrier density at any temperature to

n_1d(T) = g_s g_v (2πm_e k_bT/h²)^{1/2} F_{−1/2}(E_F/k_bT) = N_c^1d F_{−1/2}(η),  (5.23)

where we have made use of the definition of the Fermi–Dirac integrals F_j(η) from Chapter 4, Equation 4.25. We have collected the coefficient into N_c^1d in a form that is called the band-edge density of states, a name that can (and will!) be justified only later for semiconductors. We will see shortly that the relation between the free electron carrier density and temperature in the most general form for d dimensions is

n_d(T) = N_c^d F_{(d−2)/2}(η), where N_c^d = g_s g_v (2πm_e k_bT/h²)^{d/2}  (5.24)

is the band-edge DOS for d dimensions. The order of the Fermi–Dirac integral is (d − 2)/2, which is −1/2 for d = 1 or 1D, 0 for 2D, and 1/2 for 3D. Thus Equation 5.23 is the special case for 1D of Equation 5.24. Fig. 5.5 shows the value of the band-edge density of states for free electrons for d = 1, 2, 3 for temperatures ranging from 1 − 1000 K. Though we write N_c^d to avoid clutter, the temperature dependence should not be forgotten. At room temperature, the free electron band-edge DOS are: for 1D, N_c^1d(300 K) = 4.6 × 10⁶/cm, for 2D, N_c^2d(300 K) = 10¹³/cm², and for 3D, N_c^3d(300 K) = 2.5 × 10¹⁹/cm³. These densities typically mark the crossover from classical (non-degenerate) behavior for n_d < N_c^d to quantum (degenerate) behavior for n_d > N_c^d. Now if the total density of electrons n_1d does not change¹² with temperature, then the Fermi level E_F^1d(T) must adjust to ensure that

n_1d(T = 0 K) = n_1d(T) ⟹ n_1d = N_c^1d F_{−1/2}(η).
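Two quick numerical checks of the values quoted in this section (a sketch, not from the text; CODATA constants): the T = 0 K Fermi quantities of Equation 5.21 for n_1d = 10⁸/cm, and the room-temperature band-edge DOS N_c^d:

```python
import numpy as np

hbar = 1.054571817e-34
me = 9.1093837015e-31
kb = 1.380649e-23
h = 6.62607015e-34
q = 1.602176634e-19
gs, gv = 2, 1

# (1) T = 0 K Fermi quantities for n1d = 1e8 electrons/cm (Equation 5.21):
n1d = 1e8 * 1e2                        # per m
kF = np.pi * n1d / (gs * gv)           # from gs*gv*(2*kF)/(2*pi/L) = N
lamF = 2 * np.pi / kF                  # Fermi wavelength, ~0.4 nm
EF_eV = (hbar * kF)**2 / (2 * me) / q  # Fermi energy, ~10 eV

# (2) Band-edge DOS Nc^d = gs*gv*(2*pi*me*kb*T/h^2)^(d/2) at 300 K:
def Nc(d, T=300.0):
    return gs * gv * (2 * np.pi * me * kb * T / h**2)**(d / 2)

Nc1d = Nc(1) / 1e2   # ~4.6e6 per cm
Nc2d = Nc(2) / 1e4   # ~1e13 per cm^2
Nc3d = Nc(3) / 1e6   # ~2.5e19 per cm^3
```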


Solving the boxed equation gives the temperature dependence of the Fermi level E_F^1d(T) = η · k_bT. No analytical expression is possible in 1D for the most general temperature. But if the temperature is such that k_bT << µ_1d, we can use η >> +1, the degenerate limit of the Fermi–Dirac integral given in Table 4.2, and we obtain after some algebra the relation E_F^1d(T) ≈ µ_1d[1 + (π²/12)(k_bT/µ_1d)²] for k_bT << µ_1d. Most properties of this electron distribution are quantum mechanical and strongly affected by the exclusion principle.

¹² The carrier density in a band is typically fixed for metals as a function of temperature, which is the subject of interest in this chapter. For semiconductors, due to the proximity to an energy gap, the density can change with temperature. This we will encounter in later chapters.
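In the general case the boxed equation n_1d = N_c^1d F_{−1/2}(η) must be solved numerically for η. A minimal sketch (not from the text) using bisection and a direct quadrature of the Fermi–Dirac integral; the density is the "low" 10⁶/cm example discussed below:

```python
import numpy as np

me = 9.1093837015e-31
kb = 1.380649e-23
h = 6.62607015e-34
q = 1.602176634e-19
gs, gv = 2, 1
T = 300.0

def F_minus_half(eta):
    """F_{-1/2}(eta) = (1/Gamma(1/2)) * int_0^inf u^{-1/2}/(1+exp(u-eta)) du.
    The substitution u = t^2 removes the endpoint singularity."""
    t = np.linspace(0.0, 40.0, 200001)
    f = 2.0 / (1.0 + np.exp(np.minimum(t * t - eta, 700.0)))
    dt = t[1] - t[0]
    return (0.5 * f[0] + f[1:-1].sum() + 0.5 * f[-1]) * dt / np.sqrt(np.pi)

Nc1d = gs * gv * np.sqrt(2 * np.pi * me * kb * T / h**2)   # band-edge DOS, per m

# Bisection for eta = EF/(kb*T), since Nc1d*F_{-1/2}(eta) increases monotonically:
n1d = 1e6 * 1e2   # 1e6 electrons/cm
lo, hi = -30.0, 300.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if Nc1d * F_minus_half(mid) < n1d else (lo, mid)
eta_F = 0.5 * (lo + hi)        # slightly negative at this density, as the text notes
EF_eV = eta_F * kb * T / q
```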


Fig. 5.6 Carrier statistics and energies for the 1D free electron gas. (a) The 1D electron density as a function of the Fermi level E_F, governed by n_1d = N_c^1d F_{−1/2}(E_F/k_bT), shown for various temperatures. For high carrier densities (the degenerate limit), the carrier density does not depend on T and depends on E_F as a polynomial power. For low densities (the non-degenerate limit), the dependence on E_F and T is e^{E_F/k_bT}, i.e., exponential and very sensitive both to the Fermi level and the temperature. (b) Shows the dependence on the linear scale. It is important to note that, though it appears so, n_1d ≠ 0 for E_F < 0.

For η_s >> +1 and η_d >> +1, ln(1 + e^η) ≈ η, and the quantum mechanical 1D current is

J_1d ≈ (q g_s g_v/2πħ)(k_bT)(η_s − η_d) = g_s g_v (q²/h) V ⟹ G_1d = J_1d/V = g_s g_v (q²/h),  (5.40)

where we find that the quantum conductance of a 1D wire depends only on the fundamental constants q and h, and does not depend on the

energy level E_c as the minimum allowed electron energy of the problem at the source injection point. For the free electron, E_c = 0, whereas in a metal or a semiconductor, it will be the bottom of the conduction band energy.
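The conductance quantum in Equation 5.40 is pure fundamental constants; a one-line check (g_s = 2, g_v = 1):

```python
h = 6.62607015e-34    # Planck constant, J*s
q = 1.602176634e-19   # electron charge, C
gs, gv = 2, 1

G1d = gs * gv * q**2 / h   # quantum of 1D ballistic conductance, ~77.5 uS
R1d = 1.0 / G1d            # ~12.9 kOhm, independent of wire length and electron density
```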



Fig. 5.8 Quantum current in a 1D conductor. (a) E(k) diagrams show the changes in the corresponding occupied electron states for n_1d = 5 × 10⁷/cm for three different voltages. (b) The right-going and left-going density of states, and how their occupation is changed with the applied voltage. (c) The calculated Fermi level E_F = η_F · k_bT at V = 0 shown as a dashed line, and the split normalized Fermi levels E_Fs = η_s · k_bT and E_Fd = η_d · k_bT for nonzero normalized voltages qV = v_d · k_bT for two values of 1D electron density at 300 K. The red curves are for n_1d = 5 × 10⁷/cm, and the blue for n_1d = 10⁶/cm. (d) The resulting quantum mechanical current flowing in response to the voltage for six values of 1D electron densities ranging from 0.1 − 5.0 × 10⁷/cm. For example, at a 1D electron density of n_1d = 10⁷/cm, the maximum (or saturation) current is ∼ 70 µA.

length of the wire! For g_s = 2 and g_v = 1, the quantum of conductance G_1d corresponds to a resistance 1/G_1d ≈ h/2q² ≈ 12.9 kΩ. This value of conductance is experimentally achieved at high carrier densities in the ballistic limit, when there is no scattering of the electron states. This quantum value of conductance has been experimentally observed in nanowires, carbon nanotubes, and in "break junctions". The first surprise then is that the quantized conductance does not decrease with increasing length, unlike the classical Ohm's law explained by the Drude model in Chapter 2. The classical resistance increases with the length, but the quantum current is ballistic: an electron wave incident from the source contact makes it all the way to the drain if there is no scattering. That scattering is necessary to explain the classical result of Ohm's law will be discussed in later chapters. Evaluating the quantum current further for a range of 1D electron densities and voltages reveals further new surprises that are inconsistent with classical Drude-model type current flow, yet are consistent with what is experimentally measured for nanostructures. To calculate the total current, we use Equation 5.37, with Equation 5.34. In Fig. 5.8, panels (a) and (b) schematically show the occupation of the free electron energies in the k-space and the density of states picture. The right-going carriers are in equilibrium with the source, and the left-going carriers in equilibrium with the drain. Panel (c) of Fig. 5.8 shows the calculated values of E_Fs = η_s · k_bT and E_Fd = η_d · k_bT for two starting 1D electron densities at T = 300 K. The split is equal to the applied voltage, as evaluated from Equation 5.34. At T = 300 K, and for a "low" electron density n_1d = 10⁶/cm shown in blue, η_F is slightly negative. Upon application of the drain voltage, the two Fermi levels split, but η_s stays close to η_F, whereas η_d is lowered by v_d = qV/k_bT. On the other hand, for a high 1D electron density of n_1d = 5 × 10⁷/cm shown in red, η_F ≈ +90, and it splits initially nearly symmetrically into η_s and η_d around η_F for low voltages. As the voltage is increased, the split becomes increasingly asymmetric, till η_d hits zero at v_d = v_sat. For larger voltages, η_s flattens, and η_d nosedives into the negative, picking up any increase in the voltage, all the while maintaining η_s − η_d = v_d. Panel (d) of Fig. 5.8 shows the quantum or ballistic current calculated from Equation 5.37 for several electron densities. For each 1D electron density, there is a characteristic drain voltage V_sat beyond which η_s becomes flat, and η_d dives into the negative. The net current saturates¹⁵ at this voltage and higher, since almost all available electrons are moving to the right, as indicated in the E(k) diagram. When this happens, the 1D conductor is not able to provide any more current; any further increase in voltage goes into decreasing the left-going carriers, but there are almost none of them, and their contribution to the current is negligible.
For voltages less than V_sat, the 1D ballistic conductance is the quantum of conductance 2q²/h for all electron densities. An expression for V_sat may be obtained from a simple geometric analysis of Fig. 5.8. The onset of current saturation occurs when E_Fd = 0. Because E_Fs − E_Fd = qV, we have V_sat = E_Fs^sat/q at the voltage when E_Fd = 0. Writing the Fermi level at V = 0 as E_F, since all electrons are moving to the right at saturation, and in d dimensions E_F ∼ k_F² ∼ n_d^{2/d},

E_Fs^sat = E_F · 2^{2/d} ⟹ V_sat = E_F · 2^{2/d}/q.

¹⁵ The saturation of current in a conductor is an extremely important characteristic for several applications. Saturation makes the output current independent of the voltage drop across the conductor, as long as it is larger than V_sat. This means that one conductor can be used to drive several similar conductors, and the voltage drops of these driven conductors may fluctuate without affecting the current in the driver. This is called fan-out, and is of utmost importance in electronic circuits.


For example, for 1D electrons of density n_1d = 5 × 10⁷/cm, E_F ∼ 2.4 eV, and the saturation voltage is V_sat = 2² · E_F/q ∼ 9.4 Volt, consistent with the top curve in Fig. 5.8 (d). The saturation voltage decreases monotonically with the electron density¹⁶. What is the meaning of a negative quasi-Fermi level? For free electrons, the lowest allowed electron energy is E(k = 0) = 0, and cannot be negative. However, the Fermi level characterizing the Fermi–Dirac occupation function can be negative. For example, E_Fd, characterizing the drain Fermi level, which is equal to the Fermi level of the left-going

¹⁶ The dependence of the saturation voltage on the electron density in various dimensions goes as V_sat ∝ n_1d², n_2d¹, n_3d^{2/3}.


¹⁷ We have discussed several aspects of the quantum 1D conductor in detail. Our hard work will now pay off handsomely as we step to higher dimensions, because the 2D and 3D cases are qualitatively similar, except for some quantitative details. What is remarkable is that almost all the discussions and expressions we have described here for the free electrons carry over with minor changes to metals and semiconductors when electrons experience the periodic potential of atoms. So spend enough time understanding the physics of the 1D free electron gas, and re-read this section if some parts are not clear, till you are comfortable with both the qualitative and quantitative aspects of the 1D electron gas.

carriers, can go below E(k) = 0. At the real-space points along the conductor where this condition is met, there are very few left-going states filled; all electrons are essentially moving to the right. In later chapters, when we discuss not the free electron case but electrons in semiconductors, the lowest allowed electron energy will be the conduction band edge E_c, and the Fermi level will be measured with respect to this energy: η_d = (E_Fd − E_c − qV)/k_bT. This quantum J_1d − V behavior of a 1D electron gas has three non-classical features, not explainable by the Drude model and not consistent with Ohm's law¹⁷. They are:

1) In the linear regime, the 1D conductance is 2q²/h, and does not depend on the length of the conductor. This we have pointed out earlier.

2) In addition to not depending on the length of the conductor, the 1D quantum conductance is also independent of the 1D electron density! In the Drude model, a higher electron density leads to a higher conductance, but not necessarily so in the quantum limit.

3) The fact that the quantum current saturates is non-classical, and not seen in Ohm's law. The saturation current depends on the 1D electron density, even though the 1D conductance in the linear regime does not. So the quantum picture sets an upper limit on the current a 1D conductor can carry.

Looking closer at the J_1d vs v_d plot in Fig. 5.8, we recognize that the current-voltage behavior looks very much like the output characteristics of a transistor. We realize that if by some means we could change the 1D electron density with an extra, third electrode, without drawing any current into this electrode, we could get all the curves of this plot with the same conductor. This indeed is the underlying working principle of a transistor, which we discuss in later chapters.
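As a numerical check of the saturation-voltage expression V_sat = 2^{2/d} E_F/q for d = 1, using the highest density of Fig. 5.8 (a sketch, not from the text):

```python
import numpy as np

hbar = 1.054571817e-34
me = 9.1093837015e-31
q = 1.602176634e-19
gs, gv = 2, 1

n1d = 5e7 * 1e2                         # 5e7 electrons/cm, in per m
kF = np.pi * n1d / (gs * gv)            # T = 0 K Fermi wavevector
EF_eV = (hbar * kF)**2 / (2 * me) / q   # Fermi energy at V = 0, ~2.4 eV
Vsat = 2**(2 / 1) * EF_eV               # 1D: Vsat = 4*EF/q, ~9.4 V
```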

5.4 The electron steps into a higher dimension: 2D

Fig. 5.9 Periodic boundary conditions in 2D lead to a torus.

Wavefunctions and Momenta: Now let the electron move in two dimensions – say in a square box of side L and area A = L², shown in Fig. 5.9. The spatial coordinate is now the 2D vector r = (x, y), and the wavevector is k = (kx, ky). Similar to the 1D case, by subjecting the electron to the Schrödinger equation in two dimensions with V(r) = 0, we find that the allowed wavefunctions are

ψ(r) = (1/√(L²)) e^{i(kx x + ky y)} = (1/√A) e^{ik·r}.


If we paste the edges along x = 0 and x = L, and then the edges along y = 0 and y = L, the periodic boundary condition in real space and in k-space leads to a torus (Fig. 5.9). This requires e^{ikx L} = 1 and e^{iky L} = 1, which yields the allowed wavevectors and momenta

k = (k_nx, k_ny) = (2π/L)(nx, ny) ⟹ p = ħk, |p| = (h/L)√(nx² + ny²), (5.43)

where nx, ny are independent integers ..., −2, −1, 0, 1, 2, .... In the k-space, the allowed set of points forms a rectangular grid, each point occupying an area (2π/L)². Each point in this grid defines an allowed state for electrons. This is shown in Fig. 5.10 (a).

Energy and Density of States: The allowed energy of each such allowed k-point defines the paraboloid

E(kx, ky) = (ħ²/(2me))(k_nx² + k_ny²) = E(nx, ny) = (h²/(2me L²))(nx² + ny²) = ħ²|k|²/(2me). (5.44)

To find the 2D DOS g2d(E), we try our intuition like in the 1D case, but we must be careful. Indeed the energies bunch up as we approach (kx, ky) = (0, 0), but we must not forget that unlike the 1D case, where there were a mere two points of equal energy, we have an entire circle of equal-energy states in 2D, as shown in Fig. 5.10 (a) on the surface of the paraboloid. In the 2D case, the number of states increases as k², the area of the k-space circle, and so does the energy. Since the DOS is the number of states per unit energy, the two effects cancel out. The resulting 2D DOS is independent of electron energy:

gs gv · (2πk dk)/((2π/L)²) = G2d(E) dE ⟹ g2d(E) = G2d(E)/L² = (gs gv me/(2πħ²)) Θ(E). (5.45)

Here 2πk dk is the area of the thin ring of thickness dk around the circle of radius k, as indicated in Fig. 5.10 (a). Because each state occupies an area (2π/L)² in the k-space, there are gs gv · (2πk dk)/((2π/L)²) energy states in the ring, from which we get the DOS g2d(E). We also note that because for a free electron gs = 2 and gv = 1, the 2D DOS is typically written as g2d(E) = me/(πħ²) for E > 0. The DOS is shown in Fig. 5.10 (b).

The fact that the 2D DOS for a parabolic energy dispersion is a constant in energy plays a very important role in semiconductor field-effect transistors, where the conducting channel is a 2D electron gas. Moving to the many-electron picture in 2D, let us put N non-interacting electrons in the area A so that their density per unit area is n2d = N/A. At T = 0 K, we apply the Pauli exclusion principle again and find that we must fill the states from the center k = (0, 0) to a sharply defined Fermi circle of radius kF given by

gs gv · (πkF²)/((2π/L)²) = N ⟹ kF = √(4π n2d/(gs gv)).


Fig. 5.10 (a) Energy eigenvalues on a paraboloid E(k x , k y ) indicating the 2D grid in the k-space, and (b) the density of states (DOS) for free electrons in two dimensions.

100 Electrons in the Quantum World

For gs = 2 and gv = 1, the Fermi wavevector is kF = √(2π n2d). In semiconductor field-effect transistors, typical sheet densities of 2D electron gases (2DEGs) are n2d ∼ 10^12/cm². The Fermi wavevector is then kF ∼ 2.5 × 10^8 /m, implying a Fermi wavelength of λF = 2π/kF ∼ 25 nm. On the other hand, for a 2D metal with n2d ∼ 10^16/cm², the Fermi wavelength is much shorter, λF ∼ 0.25 nm, a typical interatomic distance in a metal or semiconductor crystal. At nonzero temperatures, the 2D electron density is
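These order-of-magnitude estimates are easy to reproduce numerically; the short sketch below assumes the free electron mass and gs = 2, gv = 1:

```python
import math

hbar = 1.054571817e-34  # J s
me = 9.1093837015e-31   # kg
qe = 1.602176634e-19    # J per eV

n2d = 1e12 * 1e4        # 10^12 /cm^2, converted to /m^2
kF = math.sqrt(2 * math.pi * n2d)   # Fermi wavevector for gs = 2, gv = 1
lamF = 2 * math.pi / kF             # Fermi wavelength
g2d = me / (math.pi * hbar**2)      # constant 2D DOS, in 1/(J m^2)
g2d_eVcm2 = g2d * qe / 1e4          # converted to 1/(eV cm^2)
print(kF, lamF, g2d_eVcm2)
# kF ~ 2.5e8 /m, lamF ~ 25 nm, g2d ~ 4.2e14 /(eV cm^2)
```

Scaling g2d by an effective mass m*/me gives the DOS of a semiconductor 2DEG channel directly, since the 2D DOS is linear in the mass.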





n2d = (gs gv/L²) ∑_k f(k) = (gs gv/L²) ∫_0^∞ [(2πk dk)/((2π/L)²)] · 1/(1 + e^{(ħ²k²/(2me) − EF^{2d}(T))/kbT}).






Fig. 5.11 Dependence of the Fermi level on the density of free 2D electrons and on temperature. At high densities and low temperatures, EF ≈ n2d/g2d.

By making the change of variables u = ħ²k²/(2me kbT) and η = EF^{2d}(T)/kbT, and using the definition of the Fermi–Dirac integral F0(η) = ln(1 + e^η), we rewrite the temperature-dependent 2D electron density as

n2d(T) = Nc2d · F0(η) = (gs gv me kbT/(2πħ²)) · ln(1 + e^{EF^{2d}(T)/kbT}),

where we have defined the 2D band-edge DOS Nc2d = gs gv me kbT/(2πħ²). Using ħ = h/2π and rewriting it as Nc2d = gs gv (2π me kbT/h²), we see that the free electron band-edge DOS and the electron density follow the d-dimensional rules Ncd = gs gv (2π me kbT/h²)^{d/2} and nd = Ncd · F_{(d−2)/2}(η), with d = 2, as was stated in Equation 5.24. For non-zero temperatures, the smearing of the Fermi–Dirac distribution near E = EF makes the Fermi circle diffuse. To get the electron density in a 2DEG at any temperature T, we use the 2D DOS and the fact that the electron density does not change with temperature to get the Fermi level EF thus:

n2d = ∫_0^∞ dE · g2d(E) · f(E) = Nc2d ln(1 + e^{EF/kbT}) ⟹ EF = kbT ln(e^{n2d/Nc2d} − 1).


If the temperature is low enough so that n2d >> Nc2d, the Fermi level is given simply by EF ≈ kbT · (n2d/Nc2d) = n2d/g2d, or n2d = g2d EF, where g2d = gs gv me/(2πħ²). The meaning of this relation: electron density = (DOS) × (energy window). The dependence of the Fermi level on temperature and the 2D free electron density is shown in Fig. 5.11. As an example, typical 2DEGs in transistors have carrier concentrations n2d ∼ 10^12 − 10^13/cm². At low temperatures, and even at room temperature, these 2DEGs are well in the quantum regime, and the Fermi level has a weak dependence on temperature, if at all. Fig. 5.12 (a, b) shows the dependence of the 2D electron gas density on the Fermi level.
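The closed form EF = kbT ln(e^{n2d/Nc2d} − 1) is simple to evaluate; a small sketch (free electron mass, gs = 2, gv = 1 assumed) that also exhibits the degenerate limit EF → n2d/g2d at low temperature:

```python
import math

hbar = 1.054571817e-34; me = 9.1093837015e-31
q = 1.602176634e-19; kB = 1.380649e-23
gs, gv = 2, 1

def fermi_level_2d(n2d_cm2, T):
    """EF in eV (measured from the band bottom) of a free 2DEG at temperature T,
    from n2d = Nc2d ln(1 + exp(EF/kbT))."""
    n2d = n2d_cm2 * 1e4  # convert to /m^2
    Nc = gs * gv * me * kB * T / (2 * math.pi * hbar**2)
    x = n2d / Nc
    # ln(exp(x) - 1) = x + ln(1 - exp(-x)), numerically stable for large x
    return kB * T * (x + math.log1p(-math.exp(-x))) / q

print(fermi_level_2d(1e12, 300))  # ~ -0.06 eV: non-degenerate at 300 K
print(fermi_level_2d(1e12, 1))    # ~ +2.4 meV ~ n2d/g2d: degenerate limit
```

Note that 10^12/cm² at room temperature sits just below the degeneracy crossover, which is why the Fermi level dips below the band edge there.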




Fig. 5.12 Carrier statistics and energies for the 2D free electron gas. (a) The 2D electron density as a function of the Fermi level EF, governed by n2d = Nc2d F0(EF/kbT), shown for various temperatures. For high carrier densities (the degenerate limit), the carrier density does not depend on T and depends on EF as a polynomial power. For low densities (the non-degenerate limit), the dependence on EF and T is e^{EF/kbT}, i.e., exponential and very sensitive both to the Fermi level and the temperature. (b) shows the dependence on the linear scale. It is important to note that though it appears so, n2d ≠ 0 for EF < 0.

Fig. 5.14 The right-going electron states with kx > 0 are in equilibrium with the source Fermi level EFs, and the left-going electron states with kx < 0 are in equilibrium with the drain Fermi level EFd. (c) The calculated split in the quasi-Fermi levels EFs and EFd for two 2D electron gas densities. The drain voltage at which EFs saturates is clearly identified around V = 2 Volt for n2d ∼ 5 × 10^14/cm², which is a metallic density, whereas it is very small for n2d ∼ 10^13/cm², closer to a 2D electron gas in a semiconductor nanostructure. (d) The ballistic 2D current for various 2D electron gas densities. The current is expected to saturate at very high current densities and at high voltages. This high current density is typically not achieved in 2D metals or semiconductors because of heat dissipation, which is why current saturation is typically not observed in 2D metallic conductors; only the linear portion is observed in experiments. However, saturation is clearly observed when the 2D electron gas density is low, in the 10^13/cm² range.

V = 0, if the 2D electron density is fixed, the sum of the areas of the two split halves must remain the same as that of the Fermi circle at V = 0. Since the area of the right half circle increases with V, there is a characteristic voltage at which the area of the right-going half circle will become equal to the area of the circle at V = 0, as shown in Fig. 5.14 (b). This will mark a crossover into current saturation19. Let us now make the above observations quantitative. If the contacts are ohmic, the total 2D electron density at the source injection point does not change with the applied voltage. Then, we must have

n2d = Nc2d · F0((EF − Ec)/kbT) = (Nc2d/2) · [F0((EFs − Ec)/kbT) + F0((EFd − Ec)/kbT)], (5.54)

which can now be solved with EFs − EFd = qV to obtain how the right- and left-going Fermi surfaces change with voltage. The calculated values are shown in Fig. 5.14 (c) for two representative electron densities.

19 The edge of the filled states defines a sharp circle only at T = 0 K. At finite temperatures, the edges are diffuse, but still maintain a circular shape.
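Equation 5.54 has no closed-form solution for EFs once both halves of the Fermi circle matter, but it is a one-dimensional root-finding problem; a minimal bisection sketch at 300 K (free electron mass, gs = 2, gv = 1 assumed):

```python
import math

kB_T = 0.02585          # eV at 300 K
hbar = 1.054571817e-34; me = 9.1093837015e-31; q = 1.602176634e-19
gs, gv = 2, 1

Nc = gs * gv * me * (kB_T * q) / (2 * math.pi * hbar**2)  # 2D band-edge DOS, /m^2

def F0(eta):
    # Fermi-Dirac integral of order 0: F0(eta) = ln(1 + e^eta), overflow-safe
    return eta + math.log1p(math.exp(-eta)) if eta > 0 else math.log1p(math.exp(eta))

def split_fermi_levels(n2d_cm2, V):
    """Solve Eq. 5.54 by bisection for EFs - Ec (in eV); EFd = EFs - qV."""
    target = n2d_cm2 * 1e4 / Nc
    f = lambda eta: 0.5 * (F0(eta) + F0(eta - V / kB_T)) - target
    lo, hi = -60.0, 400.0   # f < 0 at lo, f > 0 at hi; f is monotonic in eta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    efs = 0.5 * (lo + hi) * kB_T
    return efs, efs - V

efs0, _ = split_fermi_levels(1e13, 0.0)
efs2, efd2 = split_fermi_levels(1e13, 2.0)
print(efs0, efs2, efd2)  # EFs rises slightly with V while EFd is dragged qV below it
```

The same few lines, swept over V, reproduce the quasi-Fermi-level splits plotted in Fig. 5.14 (c).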


The behavior is similar to the case discussed earlier for the 1D electron system, and so is the explanation. For a low 2D electron density n2d = 10^13/cm², EFs is close to zero, whereas EFd is qV below EFs. For a high density n2d = 5 × 10^14/cm², EFs increases with voltage and EFd decreases. At V ∼ 2.5 Volt, EFd = 0, and for any further increase in voltage, EFs clamps to its value at V ∼ 2.5 Volt and EFd decreases faster, accounting for all the voltage. This is the condition shown in panel (b) of the figure, when carriers fill the right-going half circle in the k-space. The electron group velocities point in various directions, making angles −π/2 ≤ θ ≤ +π/2, but they have kx > 0. Crudely, one can say that most electrons are "moving to the right", and the quantum current should saturate for higher voltages, as the occupation in the k-space does not change appreciably. To find the quantum current flowing to the right, we integrate over all kx > 0 by using the x-component of the group velocity vg · x̂ = (ħk/me) cos θ, where k = √(kx² + ky²), to obtain

J2d = Id/W = (q gs gv/(2π)²) ∫_{θ=−π/2}^{+π/2} ∫_{k=0}^{∞} [(ħk/me) cos θ] k dk dθ / (1 + e^{(ħ²k²/(2me) − EFs)/kbT}) = J0^{2d} F_{1/2}(ηs),

where

J0^{2d} = (q²/h) · Nc1d · (kbT/q),

and Nc1d = gs gv (2π me kbT/h²)^{1/2}. The net ballistic current per unit width at a voltage V is then the difference between the right- and left-going electron currents

J2d = J0^{2d} · [F_{1/2}((EFs − Ec)/kbT) − F_{1/2}((EFs − Ec − qV)/kbT)]. (5.57)

20 Chapter 17 discusses the mode approach in further detail.


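The two regimes of Equation 5.57 – linear in V for qV << kbT and saturating for qV >> kbT – can be checked directly by evaluating the Fermi–Dirac integral F_{1/2} numerically. In the sketch below, ηs is held fixed at an assumed value, so only the shape of the two limits is being illustrated, not the full self-consistent solution:

```python
import math

def F_half(eta, steps=8000, xmax=60.0):
    """Fermi-Dirac integral of order 1/2, midpoint rule, normalized by Gamma(3/2)."""
    dx = xmax / steps
    s = sum(math.sqrt((i + 0.5) * dx) / (1.0 + math.exp((i + 0.5) * dx - eta)) * dx
            for i in range(steps))
    return s / math.gamma(1.5)

kT = 0.02585   # eV at 300 K
eta_s = 2.0    # assumed fixed source degeneracy, for illustration only
# J2d / J0 = F_half(eta_s) - F_half(eta_s - qV/kT):
J = [F_half(eta_s) - F_half(eta_s - V / kT) for V in (0.001, 0.002, 0.5, 1.0)]
print(J)  # J doubles from 1 mV to 2 mV (linear); J(1.0 V) ~ J(0.5 V) (saturated)
```

The left-going term F_{1/2}(ηs − qV/kbT) dies off exponentially once qV exceeds a few kbT, which is exactly the saturation mechanism described in the text.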
This current per unit width is plotted in Fig. 5.14 (d) for various 2D electron densities ranging from n2d = 0.1 − 5 × 10^14/cm² at T = 300 K. Such high saturation currents are typically not observed experimentally in 2D metals because heat dissipation limits the amount of current per unit width to < 10 mA/µm. But in 2D electron gases in semiconductors, the electron densities are in the n2d ∼ 10^13/cm² range, and current densities of a few mA/µm are measured. This is the typical output current per unit width of a field-effect transistor that uses a 2D electron gas as the conducting channel. A second observation is that unlike the 1D case, the linear or "ohmic" part of the current where J2d ∝ V is not the same for all 2D electron gas densities. From Fig. 5.14 (b), the 2D electron system may be considered a set of 1D conductors, each of a fixed ky. Then it is seen that the applied voltage changes the total number of the 1D modes, and the current may also be evaluated by summing the currents carried by each 1D mode20. Using this procedure for evaluating the total current leads to the same result as in Equation 5.57. The voltage at which the current saturates increases linearly with the 2D electron density as Vsat ∝ n2d. The saturation current per unit width in the degenerate carrier concentration limit is obtained when EFs >> kbT, implying F_{1/2}(ηs) ≈ ηs^{3/2}/Γ(5/2), and V > Vsat, which implies F_{1/2}(ηd) ≈ 0.

V(x) = −V0 for |x| ≤ L/2, (5.85)
V(x) = 0 for |x| > L/2. (5.86)

This is a trap for the electron that extends over a length L and is V0 units deep in energy. Let us decrease the length L ↓ and make the well deeper V0 ↑, while keeping the product, or "strength", of the trap S = V0 L constant. In the limit L → 0, V0 → ∞, meaning the potential must become infinitely deep to keep S the same. This limit is the Dirac delta potential, written simply as V(x) = −Sδ(x). For the 1D Dirac delta potential, there is only one bound state, with energy eigenvalue and eigenfunction

Eδ = −me S²/(2ħ²), and ψδ(x) = (√(me S)/ħ) e^{−me S|x|/ħ²}.



There are of course lots of propagating states with E > 0 that are scattered by the Dirac delta potential well. An important characteristic of the 1D Dirac delta potential well is that it admits exactly one bound state. One may naively think that if the strength of the potential well S is very weak, either because V0 is small or L is very narrow, the bound state is pushed out of the well. Or that there is more than one bound state for a stronger binding strength. But this is not correct in 1D – there is exactly one bound state no matter how weak or strong the binding strength. The Dirac delta potential well in 2D also allows exactly one bound state. However, unlike in 1D, the 2D bound state becomes exponentially shallow as the binding strength decreases to very small values. In a 3D Dirac delta potential well, if the binding strength is very weak, there may be no bound states at all.
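The delta-well limit can be seen numerically by solving the even-parity bound-state condition k tan(kL/2) = κ of the finite square well and shrinking L at fixed strength S = V0L; in units where ħ = me = 1, the single level should approach the delta-well value −S²/2. A sketch using bisection on the binding energy B = |E|:

```python
import math

def ground_state_energy(L, S):
    """Ground state of V = -V0 for |x| < L/2 (units hbar = me = 1), with V0 = S/L.
    Solves k*tan(k*L/2) = kappa, where k = sqrt(2(V0-B)) and kappa = sqrt(2B)."""
    V0 = S / L
    def f(B):
        k = math.sqrt(2.0 * (V0 - B))
        return k * math.tan(k * L / 2.0) - math.sqrt(2.0 * B)
    lo, hi = 1e-12, V0 - 1e-12   # f > 0 at lo, f < 0 at hi; f is monotonic here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return -0.5 * (lo + hi)

print(ground_state_energy(0.1, 1.0), ground_state_energy(1e-3, 1.0))
# the second value is close to the delta-well limit -S^2/2 = -0.5
```

A bound state is found for every L and S tried, however weak, in line with the statement that the 1D well always binds exactly once.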

Fig. 5.21 The 1D Dirac delta well.



Fig. 5.22 (a) The harmonic oscillator potential, and (b) the radially attractive Coulomb potential of a hydrogen atom. The energy eigenvalues for each case are indicated.

5.8 The harmonic oscillator

Fig. 5.22 (a) shows a parabolic potential of a 1D harmonic oscillator

V(x) = (1/2) me ω² x².

Solving the Schrödinger equation yields that the allowed electron energy eigenvalues for this potential are equally spaced in energy:

En = (n + 1/2) ħω,

23 The uniform spacing is a very good model for phonons and photons, which are wave phenomena, as will be described in later chapters.


where ω is a measure of the curvature of the parabolic potential, and n = 0, 1, 2, .... The ground state energy (1/2)ħω, like that of the infinite square well potential, is nonzero. The spacing of energies is exactly ħω. Unlike the infinite square well potential, in which the eigenvalues increase as En ∼ n², they increase linearly as En ∼ n for the oscillator23. The wavefunctions are Hermite polynomials, as indicated in Fig. 5.22 (a).
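The equal spacing can be verified by brute force: discretize H = −(ħ²/2me)d²/dx² + (1/2)meω²x² on a grid and diagonalize. In units where ħ = me = ω = 1, the finite-difference sketch below recovers En = n + 1/2:

```python
import numpy as np

# Finite-difference Hamiltonian of the 1D harmonic oscillator (hbar = me = omega = 1)
N, xmax = 1500, 10.0
x = np.linspace(-xmax, xmax, N)
dx = x[1] - x[0]
main = 1.0 / dx**2 + 0.5 * x**2       # diagonal of -(1/2)d^2/dx^2 + x^2/2
off = -0.5 / dx**2 * np.ones(N - 1)   # off-diagonal of the kinetic term
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
E = np.linalg.eigvalsh(H)[:5]
print(E)  # approaches [0.5, 1.5, 2.5, 3.5, 4.5]
```

The same few lines, with V(x) replaced by any other confining potential, give the bound-state ladders of the other wells in this chapter.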

5.9 The hydrogen atom

The hydrogen atom has one proton and one electron. Fig. 5.22 (b) shows that the Coulomb potential energy of an electron at a distance r from the proton is

V(r) = −q²/(4πε0 r). (5.90)

The 3D Schrödinger equation for the electron in this potential,

−(ħ²/(2me))(∂x² + ∂y² + ∂z²)ψ(x, y, z) + V(r)ψ(x, y, z) = Eψ(x, y, z),


is solved in the spherical coordinate system by exploiting the symmetry of the potential. This was done famously for the first time by Schrödinger with help from a mathematician, and gained his equation instant popularity and fame. The energy eigenvalues are

En = −(me q⁴/(2(4πε0)²ħ²)) · (1/n²) = −(13.6/n²) eV,


with corresponding eigenfunctions ψ(r, θ, φ) = R(r)Θ(θ)Φ(φ) that have four quantum numbers, (n, l, m, s), which capture the radial part, the angular momentum of the electrons around the proton, and the spin of the electron. The eigenvalues above are shown in Fig. 5.22 (b). They are exactly the same as Bohr had obtained from his orbit theory, as was discussed in Chapter 3. But Schrödinger's equation for the first time gave a firm mathematical basis for the spectrum of the hydrogen atom, and opened up the path for understanding complicated molecules and solids. The few bound state problems discussed above have exact analytical solutions for the electron eigenvalues and eigenfunctions. There are a few more potentials for which exact solutions may be obtained, such as the half-parabolic harmonic oscillator, which allows only the odd eigenstates of the full harmonic oscillator, or for which approximate analytical solutions may be obtained, such as the triangular quantum well, which leads to Airy function solutions. With the solutions to the bound states of the electron for these simple potentials, we are well armed to handle the several classes of problems that we will encounter in the later chapters of this book. If more complicated potentials are encountered, the Schrödinger equation can be solved approximately by first finding the exactly solved potential problem that is closest, and then applying perturbation theory, which is the subject of the next two chapters.
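As a quick consistency check of the spectrum, the n = 3 → 2 transition energy reproduces the red Balmer (Hα) line of hydrogen:

```python
# Hydrogen levels E_n = -13.6/n^2 eV; photon energy of the n = 3 -> 2 transition.
def E_n(n):
    return -13.6 / n**2  # eV

dE = E_n(3) - E_n(2)     # emitted photon energy
lam_nm = 1239.84 / dE    # lambda = hc/E, with hc ~ 1239.84 eV nm
print(dE, lam_nm)        # ~1.89 eV, ~656 nm: the red H-alpha line
```

That the measured Balmer wavelengths fall out of the eigenvalue formula this directly is precisely what gave Schrödinger's equation its instant credibility.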

5.10 Chapter summary section

In this chapter, we learned:

• The momentum, eigenfunction, and eigenvalue spectra of electrons free to move in one, two, and three spatial dimensions,

• The thermodynamic properties such as the average energy and specific heat of free electrons in one, two, and three dimensions, and how the electronic part of the specific heat is lowered significantly by enforcing the Pauli exclusion principle,

• The quantum mechanical current flowing due to free electrons in 1D, 2D, and 3D conductors upon the application of a voltage across contact electrodes. The current exhibits a characteristic linear regime at small voltages that looks like Ohm's law but differs in several ways from the classical picture, and current saturation at high voltages that has no classical analog, and


• The eigenvalues and eigenfunctions of a few exactly solvable bound-state problems for the electron.

Further reading

The free electron and various bound state problems are covered well in various textbooks of quantum mechanics, of which Introduction to Quantum Mechanics by D. J. Griffiths, and Quantum Mechanics by C. Cohen-Tannoudji and B. Diu are recommended. The quantum mechanics and quantum thermodynamics of the 3D free electron problem is discussed in significant detail and with insight in the introductory chapters of Solid State Physics by N. Ashcroft and D. Mermin. The quantum transport problem is presented under the umbrella of semiconductor nanostructures and devices in Nanoscale Transistors by M. Lundstrom and J. Guo.

Exercises

(5.1) Born to be free
As discussed in this chapter, because of the Pauli exclusion principle, fermions must follow the Fermi–Dirac distribution, and they have half-integer spins. Now imagine we have a metal with n = 10^23/cm³ electrons in a cubic box of side L, and we know that electrons are fermions. Assume the electrons are completely free to move around in the box, meaning there are no atoms in their way. If that much freedom is not enough for you, how about this: completely neglect the Coulomb interactions due to the charge of the electrons! (This may all be very unsettling, but we will explain later why it is actually OK to do so – because with great freedom comes great responsibility! In fact this problem could be formulated for any fermion – for example the uncharged neutron – and the analytical answers will be the same.) Find the following at T = 0 K:
(a) The Fermi wavevector kF.
(b) The Fermi momentum pF.
(c) The Fermi energy EF.

(d) The average energy of electrons u = U/N. What is the origin of this energy?
(e) What is the average energy of the electrons if they did not follow quantum mechanics, but were subject to classical mechanics?

(5.2) Graphene DOS and Fermi–Dirac distribution
The electrons in the conduction band of graphene are free to move in 2 dimensions, forming a 2DEG. The energy-momentum dispersion relationship for the 2DEG electrons in graphene is E(kx, ky) = ħvF √(kx² + ky²), where vF is a parameter with dimensions of velocity. For graphene, it is vF = 10^8 cm/s.
(a) Make a sketch of the energy as a function of the (kx, ky) points in the 2D k-space plane, and show that the dispersion results in a conical shape.
(b) Show that the density of states for these electrons is g(E) = (gs gv/(2π(ħvF)²)) |E|, where gs = 2 is the spin degeneracy of each (kx, ky) state, and gv is the number of cones in the energy dispersion. For graphene, gv = 2.


(c) Show that at thermal equilibrium, when the Fermi level is at EF = 0, the number of conduction electrons per unit area in 2D graphene is ni = (π/6)(kbT/(ħvF))². Make a plot of this density as a function of temperature for 0 K ≤ T ≤ 500 K. Explain why your plot sets the bar on the lowest possible density of carriers achievable in graphene at those temperatures.
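A numerical evaluation of this expression (a sketch; ħ and kb taken in eV units, vF = 10^8 cm/s) shows the scale of this floor: roughly 8 × 10^10/cm² at room temperature, scaling as T²:

```python
import math

hbar = 6.582119569e-16  # eV s
kB = 8.617333262e-5     # eV/K
vF = 1e6                # m/s (= 10^8 cm/s)

def n_i(T):
    """Intrinsic graphene carrier density (pi/6)(kbT/hbar vF)^2, in /cm^2."""
    a = kB * T / (hbar * vF)             # 1/m
    return (math.pi / 6.0) * a**2 / 1e4  # /m^2 -> /cm^2

print(n_i(300))  # ~8e10 /cm^2 at room temperature
```

Because the DOS vanishes linearly at E = 0 rather than stepping to a constant, this thermal floor falls off only as T², not exponentially as in a gapped semiconductor.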

Fig. 5.23 Andre Geim and Kostya Novoselov were awarded the Nobel Prize in physics in 2010 for the discovery of graphene, the thinnest 2D crystal with remarkable electron transport properties.

(5.3) Density of states of electrons, photons, and phonons
(a) Show that for a parabolic bandstructure for electrons E(k) = Ec + ħ²k²/(2m*) with band edge Ec and effective mass m*, the DOS for electron motion in d dimensions is

gd(E) = (gs gv/(2^d π^{d/2} Γ(d/2))) · (2m*/ħ²)^{d/2} · (E − Ec)^{d/2 − 1},

where gs is the spin degeneracy, and gv is the valley degeneracy. Here Γ(...) is the Gamma function, with the properties Γ(x + 1) = xΓ(x) and Γ(1/2) = √π. You may need the expression for the surface area of a d-dimensional sphere in k-space:

Sd = (2π^{d/2}/Γ(d/2)) k^{d−1}.

Check that this reduces to the surface area of a sphere for d = 3 and the circumference of a circle for d = 2.
(b) Sketch the DOS for 3D, 2D, and 1D electron systems using the expression. Explain the roles of the valley degeneracy and the effective mass for silicon and compound semiconductors.
(c) Show that the DOS for the energy dispersion E(k) = ħvk in three dimensions is

gω(ω) = (gp ω²)/(2π²ħv³),


where ω = vk, and gp is the polarization degeneracy. This is the dispersion for waves, such as photons and phonons moving with velocity v. The parabolic DOS of phonons and photons will play an important role in the thermal and photonic properties of semiconductors.

(5.4) Periodic vs. hard-wall boundary conditions
Instead of the periodic boundary conditions that require that the wavefunction at the "ends" of a region in space just be equal (e.g. ψ(x) = ψ(x + L)), one may enforce what are called hard-wall boundary conditions for the 1D, 2D, or 3D electron gas. The hard-wall boundary condition requires that the wavefunction vanish at the edges, for example for the 1D electron gas at x = 0 and at x = L. Show that the allowed k-states are then rearranged: instead of the periodic-boundary-condition spacing kn = (2π/L)n where n = 0, ±1, ±2, ..., the hard-wall states are spaced closer: kn = (π/L)n where n = 1, 2, 3, .... Show that though the k-space spacing changes, the density of states remains the same. Show that for the 2D case, the k-states allowed by the hard-wall boundary condition occupy only the first quadrant of the k-space, but again with the same DOS. Similarly, show that for the 3D free electron gas, only the first octant of the 3D k-space hosts the allowed k-states, and the DOS is the same as that worked out for the periodic boundary condition.

(5.5) Sommerfeld's coup
(a) Using the DOS you calculated in Exercise 5.3, find the total energy of N electrons in volume V at T = 0 K for 3D, 2D, and 1D electron gases with parabolic energy dispersion. Note that you already solved the 3D electron gas problem in Exercise 5.1.
(b) Now for the heat capacity cv = (1/V)dU/dT, we need to find the total energy U at a non-zero temperature T. To do that, you can still use the fact that heating a bunch of electrons will not increase or decrease their number. Show (this is somewhat hard, but Sommerfeld did it ∼100 years ago!) that for 3D electrons, the Fermi energy changes with temperature as

EF(T) = EF(0)[1 − (1/3)(πkbT/(2EF(0)))²].


(c) Show that the heat capacity of 3D "quantum" electrons is then

cv = (π²/2) nkb · (kbT/EF(0)).
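The size of the suppression asked about in part (d) below is easy to estimate: the ratio of the Sommerfeld result to Drude's (3/2)nkb is (π²/3)(kbT/EF(0)). A sketch for a metal-like EF(0) = 5 eV (an assumed representative value):

```python
import math

kB = 8.617333262e-5  # eV/K

def cv_ratio(T, EF0_eV):
    """cv(quantum) / cv(classical) = (pi^2/3) * kB*T / EF(0)."""
    return (math.pi**2 / 3.0) * kB * T / EF0_eV

print(cv_ratio(300, 5.0))  # ~0.017: only ~2% of the classical Drude value
```

Only the thermally smeared sliver of electrons within ~kbT of the Fermi energy can absorb heat, which is why the quantum value is two orders of magnitude below the classical one at room temperature.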


(d) By comparing this form of the electron heat capacity with Drude's result cv = (3/2)nkb, can you explain why the heat capacity of the "quantum" electrons is so much smaller than that of the "classical" electrons?

(5.6) Electric current flow in semiconductor crystals

then the total conductance is NG0. This quantization is indeed measured experimentally, as seen in Fig. 5.24.
(b) Electrons sit in the nz state of a heterostructure 2D quantum well of length Lz in the z-direction and infinite depth, and are free to move in an area LxLy in the x–y directions in the well. The energy bandstructure is E(kx, ky) = ħ²(kx² + ky²)/(2m*). Show that the probability current density for the state |k⟩ = (kx, ky, k_nz) is the following:

j(kx, ky, k_nz) = (1/(LxLy)) · [(ħ/m*)(kx x̂ + ky ŷ)] · (2/Lz) sin²(k_nz z). (5.97)

(c) Provide an expression for k_nz and explain the result. Integrate the z-component to show that the 2D probability current is of the form j2d(k) = (1/L^d) vg(k), where vg(k) = (1/ħ)∇k E(k) is the group velocity. This is a more general result that applies also for particles that may appear "massless".

Fig. 5.24 Experimental evidence of conductance quantization measured in an AlGaAs/GaAs quantum well structure with N parallel quantized conductors, from Physical Review Letters, Vol. 60, Page 848 (1988). The number of 1D conductors or subbands could be tuned by a gate voltage. We will encounter this figure again in Chapter 17, Fig. 17.4.

We have discussed that the quantum mechanical current carried by electron states is found easily by using the concept of the group velocity, which is obtained directly from the bandstructure E(k). The second important concept of current flow is how the k-states are filled/emptied by metal contacts, which are reservoirs of electrons, and determine the quasi-Fermi levels of the |k⟩ states in the semiconductor. In this problem, you will gain practice in applying these concepts to find the quantum mechanical electric current in various situations.
(a) Show that the ballistic conductance G = I/V in any 1D metal or semiconductor at a low voltage V and at a low temperature qV > ħvF k0. Actually this is not fictitious any more – recently 2D crystal semiconductors with a broken symmetry in the direction perpendicular to the 2D plane have been discovered, such as silicene and germanene, atomically thin versions of silicon and germanium that have this sort of a bandstructure. They have rather interesting topological properties.

(5.8) Been there, done that: ballistic current flow and current saturation
We discussed the carrier density and the ballistic current flow for transport in d = 1, 2, 3 dimensions in this chapter. This problem is for you to practice the self-consistent calculations, understand current saturation, and connect the free-electron model to the case of ballistic transport in semiconductors.
(a) Consider the 1D case, encountered in semiconductor quantum wires and carbon nanotubes. Use a conduction band effective mass of m*c = 0.2me, valley degeneracy gv = 1 and spin degeneracy gs = 2. Calculate and plot the source quasi-Fermi level at the source injection point (EFs − Ec), and the drain quasi-Fermi level (EFs − qV − Ec), as a function of the voltage 0 < V < 2 Volt for a high electron density n1d = 5 × 10^6/cm at room temperature T = 300 K, and explain the plot. Next, plot the ballistic currents vs. voltage for electron densities n1d = 1, 2, 3, 4, 5 × 10^6/cm, both at 300 K and at 77 K for 0 < V < 2 Volt. You should observe that in 1D semiconductors the ballistic current does not depend on the 1D density at low voltages and low temperatures – why is this so? Why does the current saturate at high voltages?
(b) Now consider the 2D case, which is encountered in silicon transistors, III-V quantum well high-electron mobility transistors, and 2D crystal semiconductors and metals. For this problem, use a conduction band effective mass of m*c = 0.2me, valley degeneracy gv = 1 and spin degeneracy gs = 2. Calculate and plot the source quasi-Fermi level at the source injection point (EFs − Ec), and the drain quasi-Fermi level (EFs − qV − Ec), as a function of the voltage 0 < V < 2 Volt for a high electron density n2d = 5 × 10^13/cm² at room temperature T = 300 K, and explain the plot. Next, plot the ballistic current per unit width vs. voltage for electron densities n2d = 1, 2, 3, 4, 5 × 10^13/cm², both at 300 K and at 77 K for 0 < V < 2 Volt. You should observe that unlike in 1D semiconductors, the ballistic current per unit width in 2D does depend on the 2D density at low voltages and low temperatures – why is this so? Why does the current saturate at high voltages? In this problem, you have solved the 1D and 2D ballistic transistor problem in disguise. The variation of the carrier density in a transistor is done with a third terminal called the gate, which is the topic of Chapter 18: that is the only new thing you will do there; the current is already covered in this chapter.

(5.9) Velocity saturation in nanotubes and semiconductors
(a) If high energy electrons collide with the lattice and emit optical phonons at a very fast rate, they come to equilibrium with the lattice rather than the source and drain electrodes. Assume we have a metallic carbon nanotube that has a 1D energy dispersion E(k) = ħvF|k| with a spin degeneracy of gs = 2 and a valley degeneracy gv = 2. Show that if the optical phonon energy is ħωop and the above ultrafast optical phonon emission occurs, then the saturation current in the nanotube is given by Isat = q gs gv ωop/(2π). Find the magnitude of this current for ħωop ∼ 160 meV, and compare with

experimental data (give references).
(b) At low electric fields, the velocity v of electrons in a semiconductor increases linearly with the field F according to v = µF, where µ = qτm/m* is the mobility, τm is the momentum scattering time and m* the electron effective mass. But when the electric field is cranked up, the electron velocity saturates, because the electrons emit optical phonons, each of energy ħωop, every τE seconds, dumping the energy qFv they gain from the electric field every second. Setting up the equations for the conservation of momentum and energy, and solving for the steady state, yields an estimate of this saturation velocity. Show that the saturation velocity obtained by this scheme is vsat = √(ħωop/m*) · √(τm/τE). Show that for a typical semiconductor for which ħωop ≈ 60 meV, m* ∼ 0.2me, and τm ∼ τE, the electron saturation velocity is of the order of ∼10^7 cm/s. This is a good rough number for the saturation velocity of most semiconductors.

(5.10) Fermi energy, average energy, and conductivity of graphene
The energy-momentum dispersion relationship for electrons in 2-dimensional graphene is E(kx, ky) = ħvF √(kx² + ky²), where vF = 10^6 m/s is a parameter with dimensions of velocity. Each allowed k-state holds gs = 2 spins, and there are gv = 2 valleys. Using this energy dispersion, the density of states for graphene is found (see Exercise 5.2) to be g(E) = (gs gv/(2π(ħvF)²)) |E|.

(a) If the sheet of graphene has n2d = 10^13/cm² electrons, find the Fermi wavevector kF at T = 0 K.
(b) Find the Fermi energy EF at T = 0 K. Express its numerical value in eV units.
(c) Find the average energy of the electron distribution at T = 0 K.
(d) Find an expression for the group velocity of electrons occupying the state (kx, ky). Remember that the group velocity is a vector in 2D.
(e) Make a sketch of the direction and magnitude of the group velocity of a few (kx, ky) points using part (d). How is this case different from a free electron?
(f) When I apply an electric field, a current will

flow in graphene. Sketch how the k-space picture changes from the equilibrium value to support the flow of current.

(5.11) Semiconductor density of states
Electrons in a very narrow bandgap 3D semiconductor are found to have a bandstructure E(kx, ky, kz) = ħvF √(kx² + ky² + kz²), where ħ is the reduced Planck's constant, and vF = 10^8 cm/s is a characteristic Fermi velocity. Assume the spin degeneracy of each k-state to be gs = 2 and the valley degeneracy to be gv = 1 for numerical evaluations, but first give the analytical answers.
(a) Show that the DOS per unit volume is g(E) = (gs gv/(2π²)) E²/(ħvF)³.

(b) Sketch the DOS and a Fermi occupation function with the Fermi level at E = 0, and qualitatively indicate in the sketch how the occupied electron states would change with temperature.

(c) The electron density at a temperature T is given by n = ∫ dE · g(E) · f(E), where f(E) is the Fermi–Dirac function characterized by a Fermi level E_F. Assume that the Fermi level is at E_F = 0 eV. Show that at temperature T, the occupied electron density is

n_3d = (3ζ(3)/2) · (g_s g_v)/(2π²) · (k_B T / ħ v_F)³.

Check the units. In other words, show that the electron density increases as T³. Find the 3D electron density at room temperature, T = 300 K. [You are given that ∫₀^∞ dx x²/(1 + eˣ) = (3/2)ζ(3), where ζ(...) is the zeta function, and ζ(3) ≈ 1.2.]

(d) Indicate the group velocity vectors v_g(k) = (1/ħ)∇_k E(k) of the k-states and show that they have the same magnitude at all points in the k-space, pointing radially outwards from the Γ-point, which is k = (0, 0, 0). Discuss why the velocity vector is not well defined only at the origin.

(5.12) Quantum 2D free electrons in a magnetic field
This problem is kindly provided by Prof. Farhan Rana. Consider a 2D free electron gas confined to the x–y plane. In the Sommerfeld model, the energy of an electron with wavevector k is E(k) = ħ²k²/(2m_e), and the velocity is v(k) = (1/ħ)∇_k E(k) = ħk/m_e. Now suppose a DC magnetic field B = B₀ ẑ is switched on in the z-direction, as shown in Fig. 5.25 (a). In the presence of the magnetic field, because of the Lorentz force, the momentum of the electron satisfies the equation (assuming no electric field and no scattering)


q v(k) × B = ħ dk/dt, (5.98)

which is the quantum version of Newton's law, with the Lorentz force.

Fig. 5.25 (a) 2D electron in a perpendicular magnetic field. (b) 2D electron k-space. (c) 2D electron real space.

(a) In the k-space, if the starting position of the electron (before the magnetic field was switched on) is at (k_x0, 0) as shown in Fig. 5.25 (b), then find the trajectory of the electron in the k-space after the magnetic field has been switched on. Plot the trajectory in the k-space.

(b) Continuation of part (a): If in addition, the starting position of the electron (before the magnetic field was switched on) in real space is at (x₀, 0) as shown in Fig. 5.25 (c), then find the trajectory of the electron in real space after the magnetic field has been switched on, and plot it in the real space.

(c) If you did parts (a) and (b) correctly, you would have found that the motion of the electron in both k-space and real space is periodic. Find the time period for the motion (i.e. the time taken by the electron to complete one period).

(d) Starting from Equation 5.98, prove that the energy of the electron is conserved (i.e. does not change) during its motion. Hint: the proof is just two lines of math.

(e) If instead of one electron there were many, then before the magnetic field was switched on, the total current carried by the electron gas (summing up contributions from all electrons) was given by

J = 2q ∫ d²k/(2π)² f(k) v(k) = 0,

where f(k) was the equilibrium Fermi–Dirac distribution for electrons. Find the total current carried by the electron gas after the magnetic field has been switched on, and explain your answer.

(5.13) Particle in a box
The particle in a box problem of Section 5.6, with infinitely tall barriers, is a powerful tool to estimate, using back-of-the-envelope techniques, several quantum length, velocity, and energy scales. You can get a flavor for some by elementary analysis.

(a) The energy of the nth eigenstate is E_n = n²h²/(8mL²) for a particle of mass m in a quantum well of size L. Imagine a semi-classical situation in which, instead of quantum confinement, the discrete energies arise because the particle in the nth state is moving with a velocity v_n. Show that v_n = nh/(2mL).

(b) Assume an electron of mass m_e = 9.1 × 10⁻³¹ kg in a well of size L₀ = 2 Å, representative of typical atomic radii. Show that the n = 1 ground-state velocity is ≈ 1.8 × 10⁶ m/s, and the energy is 9.4 eV, fairly representative of the scale of electron ionization energies from atoms and solids.

(c) Now consider a proton of mass m_p ∼ 2000 m_e, 2000 times heavier than the electron, quantum-confined in the ground state inside the nucleus, to a distance L = L₀/20,000 ∼ 10⁻¹⁴ m, which is 20,000 times smaller than interatomic distances. Show that its effective velocity is much faster than the electron in part (b), at 1.8 × 10⁷ m/s, and its energy much higher, at 1.9 MeV.

(d) The examples in (b) and (c) are not that artificial: alpha particles ejected from the nuclei of unstable atoms do move at these velocities. The high energy is representative of nuclear reactions, which occur at the MeV scales of part (c), compared to the much lower energy scales of chemical reactions, which are a few eV due to the electronic energies of part (b).
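The numbers in parts (b) and (c) follow from E_n = n²h²/(8mL²) and v_n = nh/(2mL); a quick numerical sketch using standard constants (an illustration, not part of the text):

```python
# Back-of-the-envelope check of the particle-in-a-box scales:
# E_n = n^2 h^2 / (8 m L^2),  v_n = n h / (2 m L).
h = 6.626e-34    # Planck constant, J*s
me = 9.109e-31   # electron mass, kg
q = 1.602e-19    # J per eV

def box(n, m, L):
    E = n**2 * h**2 / (8 * m * L**2)  # eigenenergy, J
    v = n * h / (2 * m * L)           # semi-classical velocity, m/s
    return E, v

E_e, v_e = box(1, me, 2e-10)          # electron in a 2 Angstrom well
E_p, v_p = box(1, 2000 * me, 1e-14)   # "proton" in a nuclear-scale well
print(f"electron: v = {v_e:.1e} m/s, E = {E_e / q:.1f} eV")
print(f"proton:   v = {v_p:.1e} m/s, E = {E_p / q / 1e6:.1f} MeV")
```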

Red or Blue Pill: Befriending the Matrix

Prior to Schrödinger's differential-equation form of "wave mechanics" for finding the allowed quantum states of electrons, Heisenberg, Born, and Jordan (Fig. 6.2) had developed the first complete form of quantum mechanics, but in the form of matrices. They had named it matrix mechanics. With the goal to explain the experimentally observed sharp and discrete spectral lines of the hydrogen atom, Heisenberg hit upon the crucial idea that if the dynamical variables of the electron, such as its location [x] and its momentum [p], were matrices instead of numbers, then its energy would be found from a matrix eigenvalue equation, which can yield discrete transition energies. Today we all know that matrices can have discrete eigenvalues, but this connection was not clear in the 1920s, when matrices were rarely used in physics. John von Neumann, who was David Hilbert's (Fig. 6.1) student, later proved the equivalence of Heisenberg's matrix mechanics and Schrödinger's wave mechanics. In this chapter, we solidify our notion of why matrices present a natural and unified approach to obtain quantitative solutions to the quantum mechanical problem of electrons subjected to simple and increasingly complicated potentials. The goals of this chapter are to understand:

• How does the expansion principle connect linear algebraic equations to matrices?
• What is the connection between matrices and quantum mechanical eigenstates, eigenvalues, superpositions, and operators?
• What are some special properties of matrices, and how do we evaluate them in practice?

In Chapter 5, we became acquainted with the wave-mechanics method of Schrödinger and applied it to the free electron in various dimensions, and a few other problems. In this chapter, we befriend the matrix method of solving for the quantum mechanical states and energies of electrons. For most numerical solutions, this is the method of choice. With further judicious choice, the matrix equation can give analytical solutions, as we will see in several following chapters for the electron bandstructure in periodic potentials, the situation encountered most

6.1 The expansion principle
6.2 Matrix mechanics
6.3 Matrices and algebraic functions
6.4 Properties of matrix eigenvalues
6.5 Looking ahead
6.6 Chapter summary section
Further reading


Fig. 6.1 David Hilbert, a German mathematician in Göttingen who, among many other subjects, developed the idea of Hilbert spaces, infinite-dimensional vector spaces with special significance in quantum mechanics. In a delightful story which remains to be confirmed, Hilbert had promised a seminar in the USA on the solution of the famous Fermat's last theorem, of which Fermat had claimed a proof that his margin was too small to hold. The packed audience was disappointed that his seminar had nothing to do with Fermat's last theorem. When asked, Hilbert replied that his seminar title was just in case his cross-Atlantic plane crashed.


often for semiconductors. We first motivate matrix mechanics by discussing one of the most important and least emphasized principles of quantum mechanics.

6.1 The expansion principle

Fourier's theorem mathematically guarantees that any well-behaved function f(x) can be expressed as a sum over a complete set of trigonometric functions (or complex exponentials): f(x) = ∑_k a_k e^{ikx}. Note that any complete set of eigenfunctions [..., e^{ikx}, ...] works! This set has an infinite number of elements, and the space it spans is called the Hilbert space. In practice, we typically use a restricted set for most problems. To find the Fourier coefficients, we use the "filtering" property of the complex exponentials: a_{k_n} = ∫ dx f(x) e^{−i k_n x}. If we tweak the function f(x) → f(x) + g(x) = h(x), then h(x) = ∑_k a′_k e^{ikx} is still a valid expansion; the Fourier coefficients will be tweaked from a_k → a′_k. But note that the perturbed function can still be expanded in terms of the original complete set of eigenfunctions. This idea leads to the expansion principle in quantum mechanics.

Here is the expansion principle of quantum mechanics: Any quantum state "vector" |Ψ⟩ may be expanded as a linear superposition of the eigenvectors of any Hermitian operator, |Ψ⟩ = ∑_n a_n |n⟩. For most problems, the Hermitian operator of choice is the Hamiltonian operator Ĥ = p̂²/(2m₀) + V(r), but it need not be. We choose the Hamiltonian operator since there exist a few problems, which we encountered in Chapter 5, for which we know the set of exact eigenvectors [..., |n−1⟩, |n⟩, |n+1⟩, ...]. These sets of eigenvectors are complete. We also discussed in Chapter 5 that this choice of eigenstates is stationary. For example, we found that the 1D electron on a ring problem with Ĥ = p̂²/(2m₀) gave the

Fig. 6.2 Pascual Jordan, who with Heisenberg and Max Born in Göttingen developed the first consistent version of quantum mechanics in its matrix form, and named it matrix mechanics. Jordan also did seminal work on quantum field theory. Had he not associated himself with the Nazi party, he would likely be as well recognized today as Heisenberg and Born.

eigenvalues E(k) = ħ²k²/(2m₀), and corresponding eigenvectors |k_FE⟩, whose projection onto real space is ⟨x|k_FE⟩ = (1/√L) e^{ikx}. This eigenstate basis [..., e^{ikx}, ...] is complete, where k takes all allowed values. Now consider any state of a harmonic oscillator, |ψ_HO⟩. The expansion principle guarantees that we can expand any harmonic oscillator state in the basis of the free electron: |ψ_HO⟩ = ∑_k a_k |k_FE⟩. We can do the reverse too: expand the free electron states in terms of the harmonic oscillator states. This is allowed as long as the potential term in the Hermitian operator does not blow up. For example, we can expand the particle in a box states in terms of the free electron states, |ψ_box⟩ = ∑_k a_k |k_FE⟩, but not the other way around, because the particle in a box potential blows up outside the box. This should be obvious because the eigenfunctions of the particle in a box are all zero outside the box, and no matter how clever one is, it is not possible to linearly combine zeroes to produce a function that takes non-zero values outside the box.

The expansion principle is the backbone of perturbation theory, which underpins the actual usage of quantum mechanics in semiconductor physics. In this chapter, we set up the framework for using perturbation theory by describing the matrix representation of quantum mechanics.
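The expansion principle is easy to see numerically. A minimal sketch (an illustration, not from the text): expand the particle-in-a-box ground state in a truncated free-electron (plane-wave) basis, and check that the expansion reconstructs it.

```python
import numpy as np

# Expansion principle sketch: expand the box ground state
# psi(x) = sqrt(2/L) sin(pi x / L) in the free-electron basis
# e^{ikx}/sqrt(L), k = 2*pi*n/L, using a restricted set of n.
L, N = 1.0, 2048
x = np.linspace(0, L, N, endpoint=False)
psi = np.sqrt(2 / L) * np.sin(np.pi * x / L)

ns = np.arange(-50, 51)                       # restricted set of k-states
basis = np.exp(2j * np.pi * np.outer(ns, x) / L) / np.sqrt(L)
dx = L / N
a = basis.conj() @ psi * dx                   # a_k = <k|psi>
psi_rec = a @ basis                           # sum_k a_k <x|k>

err = np.max(np.abs(psi_rec - psi))
print(f"max reconstruction error: {err:.2e}")  # small, limited by truncation
```

Because the box potential blows up outside the box, the reverse expansion (plane waves in terms of box states) would fail, exactly as argued above.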

6.2 Matrix mechanics

Since we can express any quantum state as an expansion in the eigenvectors, |Ψ⟩ = ∑_n a_n |n⟩, we can arrange the expansion coefficients as a column vector

        [a₁]   [⟨1|Ψ⟩]
|Ψ⟩  =  [a₂] = [⟨2|Ψ⟩]   (6.1)
        [a₃]   [⟨3|Ψ⟩]
        [ ⋮ ]   [  ⋮  ]

The Hermitian conjugate is obtained by transposing and taking term-by-term complex conjugation:

⟨Ψ| = [a₁*  a₂*  a₃*  ···], (6.2)

which is a row vector. If the state |Ψ⟩ is normalized, then clearly ⟨Ψ|Ψ⟩ = 1, which requires ∑_n |a_n|² = 1. Upon measurement, the state will always materialize in one of the eigenstates |n⟩. Then |a_n|² should be interpreted as the probability that the quantum state materializes in state |n⟩, and a_n is the corresponding probability amplitude. The normalization condition ∑_n |a_n|² = 1 makes sure that the probabilities add up to one, and that the particles are neither created nor destroyed: their total number stays fixed. Also, the set of eigenvectors |n⟩ can always be chosen to be mutually orthogonal, i.e., ⟨m|n⟩ = δ_mn. This is the basis we will work with. Also note that projecting |Ψ⟩ = ∑_n a_n |n⟩ on state ⟨n| yields the coefficients a_n = ⟨n|Ψ⟩ for each n. Then, we can write the expansion as |Ψ⟩ = ∑_n |n⟩⟨n|Ψ⟩, which means that

∑_n |n⟩⟨n| = 1.



This is the "closure" relation for eigenvectors that are discrete. If the eigenvectors were continuous, then the corresponding closure relation is

∫ dx |x⟩⟨x| = 1. (6.4)

The fact that the two closure relations equal unity allows us to insert them wherever they will help in the evaluation of matrix elements. Consider now an operator Â acting on the state vector |Ψ⟩. It will try to "rotate" the state vector in the Hilbert space to a state |Φ⟩, as shown pictorially in Fig. 6.3. We write this as

Â|Ψ⟩ = |Φ⟩.


By the expansion principle, we can expand the new state as |Φ⟩ = ∑_m b_m |m⟩. Then, if we project this state on |m⟩, we have

b_m = ⟨m|Φ⟩ = ⟨m|Â|Ψ⟩  →  b_m = ∑_n a_n ⟨m|Â|n⟩ = ∑_n A_mn a_n. (6.6)


Fig. 6.3 Three ways of saying the same thing. The operator Aˆ rotates a state vector |Ψi into |Φi. The pictorial depiction is equivalent to the algebraic operator equation, which in turn is equivalent to the matrix form [ A][Ψ] = [Φ].


We see that the operator Â is equivalent to a matrix, Â ≡ A_mn = [A]. The elements of the equivalent matrix are the terms A_mn = ⟨m|Â|n⟩, obtained by the operator acting on eigenstates on both sides. We call them matrix elements for obvious reasons. For example, if we choose the momentum operator acting between states |k⟩ and ⟨k′| of the free electron, we get

p_kk′ = ⟨k′|p̂|k⟩ = ∫ dx ⟨k′|x⟩⟨x|p̂|k⟩ = ħk δ_k′,k.

¹ This statement may be confusing: if |Ψ⟩ is not an eigenstate, then Ĥ|Ψ⟩ = |Φ⟩ with |Φ⟩ ≠ constant·|Ψ⟩, and we cannot have Ĥ|Ψ⟩ = E|Ψ⟩, where E is a single number. This is one of the principal axioms of quantum mechanics! Note, however, that what is meant by the statement is that the energy E is such that it is a solution to the equation Ĥ|Ψ⟩ = E|Ψ⟩ – not all E will do, but only special ones. These special values of E in the equation Ĥ|Ψ⟩ = E|Ψ⟩ are the eigenvalues, and are what are sought in the matrix method. The formulation is really meant for perturbation theory.


Note that the "abstract" operator p̂ has the matrix representation ⟨x|p̂|x⟩ = −iħ ∂/∂x in real space. The example shows that since the free-electron energy eigenstates are simultaneously momentum eigenstates, the momentum operator acting between two eigenstates extracts the value of the momentum only if the two states are identical. This is the momentum matrix element. One of the most important operators is the Hamiltonian operator, which "extracts" the energy of the state it is acting on. If the state |n⟩ happens to be an energy eigenstate, the Hamiltonian operator extracts its energy eigenvalue: Ĥ|n⟩ = E_n|n⟩. Visualize Ĥ|n⟩ as a new vector whose "direction" is the same as the eigenvector |n⟩, but with a length determined by the eigenvalue E_n. So the action of the Hamiltonian operator leaves the "direction" of the eigenvector |n⟩ unaffected. If the state is not an energy eigenstate but is a linear superposition |Ψ⟩ = ∑_n a_n|n⟩, then the time-independent Schrödinger equation states that Ĥ|Ψ⟩ = E|Ψ⟩, which is equivalent¹ to Ĥ ∑_n a_n|n⟩ = E ∑_n a_n|n⟩. When we project this new state vector Ĥ|Ψ⟩ on the eigenvector ⟨m|, we get an algebraic equation

∑_n ⟨m|Ĥ|n⟩ a_n = E a_m,



for each m. Note the appearance of the matrix elements H_mn = ⟨m|Ĥ|n⟩. If we write this out for m = 1, 2, ..., we get the set of linear equations

H₁₁a₁ + H₁₂a₂ + H₁₃a₃ + ··· = Ea₁
H₂₁a₁ + H₂₂a₂ + H₂₃a₃ + ··· = Ea₂
H₃₁a₁ + H₃₂a₂ + H₃₃a₃ + ··· = Ea₃
⋮

which is best captured as a matrix equation

[H₁₁ H₁₂ H₁₃ ···] [a₁]      [a₁]
[H₂₁ H₂₂ H₂₃ ···] [a₂]  = E [a₂]
[H₃₁ H₃₂ H₃₃ ···] [a₃]      [a₃]
[ ⋮   ⋮   ⋮   ⋱ ] [ ⋮]      [ ⋮]


Note that the Hamiltonian operator becomes a square matrix, and the state |Ψ⟩ becomes a column vector. This matrix equation contains the same information as the algebraic time-independent Schrödinger


Fig. 6.4 Examples of 2×2, 3×3, and 5×5 matrix eigenvalue and eigenfunction calculations in Mathematica. The 2×2 Hamiltonian is general, and one of the most important in all of quantum mechanics. The 3×3 matrix is a numerical example, and the 5×5 matrix is the tight-binding Hamiltonian model of a 5-site circular ring. Note that the eigenvectors (or eigenfunction coefficients a_n) are evaluated for each eigenvalue, which is very nice.

equation, and is well suited for actual calculations. If we choose to work with a restricted set of, say, 10 states, then we have a 10 × 10 matrix with 10 eigenvalues and their corresponding eigenfunctions. Fig. 6.4 shows a few examples of matrix evaluations using the Mathematica package. Historically, Heisenberg developed quantum mechanics in its matrix representation and called it "matrix mechanics". Schrödinger found the algebraic version, which appealed more to researchers, since we are trained much better in algebra from high school onwards than in matrices. But they are one and the same thing.
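The calculations of Fig. 6.4 are not tied to Mathematica. As an illustration (a sketch, not the text's own choice of tool), the general 2×2 Hamiltonian can be diagonalized with NumPy and checked against the closed-form eigenvalues:

```python
import numpy as np

# For H = [[E1, W], [W*, E2]] the eigenvalues are
# (E1+E2)/2 +/- sqrt(((E1-E2)/2)^2 + |W|^2).  Illustrative numbers below.
E1, E2, W = 1.0, 2.0, 0.5
H = np.array([[E1, W], [W, E2]])

vals, vecs = np.linalg.eigh(H)  # eigenvalues (ascending) and eigenvectors
mid, half = (E1 + E2) / 2, np.hypot((E1 - E2) / 2, W)
print(vals, (mid - half, mid + half))  # numerical and closed-form agree
```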

6.3 Matrices and algebraic functions

Numbers are solutions to algebraic equations. We start our education learning about integers because they quantify the fingers on our hands, and soon expand into the regime of real numbers. Soon, we realize there are algebraic equations such as x² + 1 = 0 which have solutions that are not real numbers, and realize there must be new kinds of numbers. Complex numbers contain i = √(−1), which (unfortunately²) is called an imaginary number. We learn how to add, subtract, multiply, divide, take square roots, exponentiate, etc. with numbers.

² "One might think ... that imaginary numbers are just a mathematical game having nothing to do with the real world. From the viewpoint of positivist philosophy, however, one cannot determine what is real. All one can do is find which mathematical models describe the universe we live in. It turns out that a mathematical model involving imaginary time predicts not only effects we have already observed but also effects we have not been able to measure yet nevertheless believe in for other reasons. So what is real and what is imaginary? Is the distinction just in our minds?" – Stephen Hawking

One can visualize a matrix as an extension of the concept of a "number". For example, the algebraic equation ax = b has the solution x = b/a, a number. A set of algebraic equations with multiple variables written in the form AX = B has the solution X = A⁻¹B. Somewhere along the line, if we do not use matrices, we forget their power and beauty! We get busy using algebraic equations extensively. It turns out every algebraic equation may be written as a matrix equation. Then we can use powerful theorems of matrices to solve or analyze them. Indeed, most numerical approaches to solving equations have to go through the matrix route. Consider the equation of a unit circle, x² + y² = 1.


This may not look like a matrix equation at all, till we define the coordinate "vector" X = [x, y]^T. Its transpose is the row vector X^T = [x, y], and the matrix version of the equation of the unit circle is then X^T X = 1. Now consider the equation

x² + axy + y² = 1,

which can be written as

         [ 1     u ] [x]
[x  y]   [ a−u   1 ] [y]   =  X^T A X = 1.

³ Such decompositions help understand the curvature of a function. For example, in the calculation of bandstructures of crystals, A will take the form of a Hessian matrix, and the "curvature" and topology of bands and their corresponding allowed eigenvalues and eigenstates can be deduced by examining the properties of the matrices.


This works for any value of u. So for the unit circle, A = I, where I is the unit matrix. The matrix A captures the geometric property of the curve – whether it is a circle, or a more complex shape3 . The strangest and most striking property of matrices is that they do not necessarily commute. Which is to say that in general for square matrices A and B, AB 6= BA. As a mathematical object, therefore they are quite distinct from real or complex numbers. Matrices thus form the natural objects for non-commutative algebra. Therefore they are central to the tenets of quantum mechanics, which is built upon the non-commutativity of operators as embodied by xˆ pˆ x − pˆ x xˆ = i¯h, which actually was derived by Heisenberg and Born for the first time in its matrix form [ x ][ p] − [ p][ x ] = i¯h[ I ]. A square matrix A has eigenvalues λi and eigenvectors [ xi ] which are obtained by solving the equation A[ x ] = λ[ x ] → [ A − λI ][ x ] = 0.


A = UDU −1 ,


After finding the eigenvalues and eigenvectors, the square matrix can be re-written in it’s spectral decomposition where D is a diagonal matrix of eigenvalues   λ1 0 · · ·   D =  0 λ2 · · ·  , .. .. .. . . .



and the unitary transformation matrix U is formed by arranging the eigenvectors in the same order as the eigenvalues:

U = [[x₁] [x₂] ···]. (6.18)

Note that U is invertible, meaning its determinant cannot be zero. Now let's say the square matrix A is actually the Hamiltonian matrix of a quantum mechanics problem. Then solving the time-independent Schrödinger equation is equivalent to diagonalizing the Hamiltonian matrix by solving the matrix equation

[H₁₁ H₁₂ H₁₃ ···] [a₁]      [a₁]
[H₂₁ H₂₂ H₂₃ ···] [a₂]  = E [a₂]   (6.19)
[H₃₁ H₃₂ H₃₃ ···] [a₃]      [a₃]
[ ⋮   ⋮   ⋮   ⋱ ] [ ⋮]      [ ⋮]

Clearly, it is equivalent to

[H₁₁−E   H₁₂     H₁₃    ···] [a₁]
[H₂₁     H₂₂−E   H₂₃    ···] [a₂]  = 0.
[H₃₁     H₃₂     H₃₃−E  ···] [a₃]
[ ⋮       ⋮       ⋮      ⋱ ] [ ⋮]
If instead of the infinite matrix we choose a restricted eigenbasis set of size N, then the solutions of the corresponding algebraic equation Det[H − EI] = 0 yield N eigenvalues E_n. Corresponding to each eigenvalue E_n, there is an eigenvector |n⟩, which is a column vector. We then construct the unitary operator U by collecting the eigenvectors, and write the Hamiltonian matrix in its diagonal form as

[H₁₁  H₁₂  H₁₃  ···  H₁N]       [E₁  0   0   ···  0 ]
[H₂₁  H₂₂  H₂₃  ···  H₂N]       [0   E₂  0   ···  0 ]
[H₃₁  H₃₂  H₃₃  ···  H₃N]  = U  [0   0   E₃  ···  0 ]  U⁻¹   (6.21)
[ ⋮    ⋮    ⋮    ⋱    ⋮ ]       [⋮   ⋮   ⋮    ⋱   ⋮ ]
[H_N1  ···           H_NN]       [0   0   0   ···  E_N]

This is the spectral decomposition of the Hamiltonian, H = UDU⁻¹, where D is a diagonal matrix whose elements are the energy eigenvalues. The exact solution requires the matrices to be infinite-dimensional, but for most practical cases we work with a restricted set. Now let's imagine that the Hamiltonian matrix is perturbed to H → H₀ + W. The eigenvalues and eigenfunctions will change. But the expansion principle tells us that the new state vector of the perturbed system can still be expanded in terms of the unperturbed eigenvectors, i.e., the matrix U. The matrix formalism makes such perturbations easy to deal with. We will return to this problem in the next chapter. The sum of the diagonal elements of a square matrix is called its trace, Tr[H] = ∑_n H_nn. For square matrices A and B, Tr[AB] = Tr[BA].


Thus, we get Tr[H] = Tr[UDU⁻¹] = Tr[U⁻¹UD] = Tr[D] = ∑_n E_n. The trace of the Hamiltonian is the sum of its eigenvalues.

The quantum states are represented as column vectors, |Ψ⟩ = [a₁, a₂, ...]^T, as discussed earlier. Consider another quantum state, |Φ⟩ = [b₁, b₂, ...]^T. If we take the projection ⟨Φ|Ψ⟩ = ∑_n a_n b_n*, we get a number. This is analogous to taking the dot product of two vectors, and is called the inner product in Dirac notation. But we can also take an outer product

            [a₁]                      [a₁b₁*  a₁b₂*  ···  a₁b_N*]
|Ψ⟩⟨Φ| =    [a₂] [b₁* b₂* ··· b_N*] = [a₂b₁*  a₂b₂*  ···  a₂b_N*]   (6.22)
            [ ⋮]                      [  ⋮      ⋮     ⋱     ⋮   ]
            [a_N]                     [a_Nb₁*  a_Nb₂* ···  a_Nb_N*]

which is no longer a number but a matrix. This matrix clearly retains the information of the phases of each quantum state and their respective projections on eigenstates. This information is lost when the inner product is taken. The outer product leads to the concept of density matrices, which keep track of the phase relationships and interferences of quantum states⁴.

⁴ Density matrices prove useful to track the evolution of quantum states in various schemes of quantum computation.

An interesting result is that the traces of the inner and outer products are the same, i.e. Tr[|Ψ⟩⟨Φ|] = Tr[⟨Φ|Ψ⟩] = ∑_n a_n b_n*. The mathematical name for an object like the outer product |m⟩⟨n| is a dyadic: it is a tensor constructed out of two vectors. That the outer product is a matrix implies we can think of it as an operator. In fact, we can construct operators of the form Â = ∑_n a_n |n⟩⟨n| with suitable coefficients a_n. One such construction will prove rather useful. Consider the Schrödinger equation Ĥ|n⟩ = E_n|n⟩ with eigenvalues E_n and eigenvectors |n⟩. We define a new operator

Ĝ(E) = ∑_n |n⟩⟨n| / (E − E_n),

where the coefficients are a_n = 1/(E − E_n), with units of inverse energy. This operator is called the Green's function operator. We will use it in the later chapters. For now, note that it blows up every time E = E_n, and changes sign as E crosses E_n. Note what happens when the Green's function operator acts on an eigenstate |m⟩:

Ĝ(E)|m⟩ = ∑_n |n⟩⟨n|/(E − E_n) |m⟩ = ∑_n |n⟩ ⟨n|m⟩/(E − E_n) = |m⟩/(E − E_m).


This happens for every eigenvalue, because |m⟩ can be any of the eigenvectors. We can picture the Green's function operator as an "energy-eigenvalue detector". As we sweep the energy, every time E = E_m there is a very large response, since Ĝ(E_m)|m⟩ → ±∞. The



response is low when E ≠ E_m. This is analogous to the concept of an "impulse response" in linear systems. The Schrödinger equation may be written as (E − Ĥ₀)|ψ⟩ = 0. Then note that if we apply the Green's function operator on the left side of the equation, we get

Ĝ(E)(E − Ĥ₀)|ψ⟩ = ∑_n |n⟩⟨n|/(E − E_n) (E − Ĥ₀)|ψ⟩ = ∑_n |n⟩⟨n|ψ⟩ = |ψ⟩. (6.25)

From the above, it is clear that Ĝ(E) = (E − Ĥ₀)⁻¹, i.e., the Green's function operator is the inverse of the operator (E − Ĥ₀). You can think of this in terms of matrices to make it more concrete⁵.
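A minimal numerical sketch of the matrix view (illustrative values, not from the text): build Ĝ(E) = (E − H)⁻¹ for a small Hermitian H, check it against the spectral form ∑_n |n⟩⟨n|/(E − E_n), and watch it blow up near an eigenvalue:

```python
import numpy as np

# Green's function operator as a matrix: G(E) = (E*I - H)^(-1),
# compared to the spectral form sum_n |n><n| / (E - E_n).
H = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 3.0]])
En, U = np.linalg.eigh(H)        # eigenvalues E_n and eigenvectors |n>

E = 1.7                          # probe energy, away from all eigenvalues
G_inv = np.linalg.inv(E * np.eye(3) - H)
G_spec = sum(np.outer(U[:, n], U[:, n]) / (E - En[n]) for n in range(3))
print(np.allclose(G_inv, G_spec))  # True: the two forms agree

# Near an eigenvalue the "energy-eigenvalue detector" responds strongly:
print(np.linalg.norm(np.linalg.inv((En[0] + 1e-9) * np.eye(3) - H)))
```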

6.4 Properties of matrix eigenvalues

A few properties of eigenvalues of square matrices that will prove useful in later chapters are summarized here without proof. They are:

⁵ Also, the right side of the Schrödinger equation was zero, meaning Ĝ(E)·0 = |ψ⟩. This may seem weird, because the Green's function operator seems to act on "zero" to create the state |ψ⟩. We will return to this strange behavior in Chapter 13 to explore what it is trying to say.

(1) If an N × N matrix M is equal to its complex conjugate transpose, M = [M*]^T, then all its N eigenvalues are real. Such matrices are called Hermitian, and this is the famous rule that Hermitian matrices are guaranteed to have real eigenvalues.

(2) The determinant of a matrix is equal to the product of its eigenvalues, i.e.,

Det[M] = λ₁λ₂···λ_N. (6.26)

(3) The order of a product of square matrices inside the determinant does not matter:

Det[AB] = Det[BA].


(4) The sum of the diagonal elements, or the trace, of a matrix is equal to the sum of its eigenvalues⁶, i.e.,

Trace[M] = λ₁ + λ₂ + ··· + λ_N.


(5) Even if matrices do not commute, their trace does commute; here the matrices do not even have to be square (A may be m × n and B n × m):

Trace[AB] = Trace[BA].
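These properties are easy to verify numerically; a sketch on a random Hermitian matrix (illustrative, not from the text):

```python
import numpy as np

# Checking properties (1), (2), (4), and (5) on random matrices.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
M = A + A.conj().T                     # Hermitian by construction

lam = np.linalg.eigvals(M)
print(np.allclose(lam.imag, 0))                      # (1) real eigenvalues
print(np.isclose(np.linalg.det(M), np.prod(lam)))    # (2) det = product
print(np.isclose(np.trace(M), np.sum(lam)))          # (4) trace = sum

B, C = rng.normal(size=(3, 5)), rng.normal(size=(5, 3))
print(np.isclose(np.trace(B @ C), np.trace(C @ B)))  # (5) non-square trace
```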


6.5 Looking ahead

What is the advantage of the spectral decomposition of a matrix, A = UDU⁻¹? Let's observe what happens when we try to square the matrix A.

⁶ When we evaluate the electron bandstructure of semiconductors in later chapters, this mathematical property translates to physical sum rules for eigenvalues in bands.


A² = UD(U⁻¹U)DU⁻¹ = UD²U⁻¹ = U diag(λ₁², λ₂², ···) U⁻¹. (6.30)

U⁻¹U = I contracts the expansion, and we only have to square the diagonal matrix, which is trivial, since we just square the eigenvalues! Think of any higher-order power of the matrix: the U's always contract, so A^N = UD^N U⁻¹! If we visualize matrices as extensions of real and complex numbers, we should be curious about doing similar operations on them. For example, what is the square root of a matrix? What is the exponential or logarithm of a matrix? Can we take sines and cosines of matrices? The answer to all of these questions is yes, and the spectral decomposition is the first step to such fun. In this chapter, we discussed the linear properties of matrices, which will help us get started with perturbation theory.
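A short sketch of the contraction trick (illustrative values, not from the text): compute A⁵ and √A via the spectral decomposition and check them against direct evaluation:

```python
import numpy as np

# A^N = U D^N U^{-1}: only the eigenvalues get powered.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, U = np.linalg.eigh(A)       # A is symmetric, so U is orthogonal

A5_spec = U @ np.diag(lam**5) @ U.T
print(np.allclose(A5_spec, np.linalg.matrix_power(A, 5)))  # True

# The same trick defines a matrix square root (eigenvalues here are positive):
sqrtA = U @ np.diag(np.sqrt(lam)) @ U.T
print(np.allclose(sqrtA @ sqrtA, A))                       # True
```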

6.6 Chapter summary section

In this chapter, we learned:

• Quantum states can be represented by analytical wavefunctions of the form ψ(x) = ∑_n a_n ψ_n(x), or equivalently as the column vectors [ψ] = [..., a_n, ...]^T. The Schrödinger equation in the differential form [−(ħ²/2m) ∂²/∂x² + V(x)] ψ(x) = E ψ(x) is equivalent to Heisenberg's matrix form [H][ψ] = E[ψ].

• Physical observables that are represented by operators in Schrödinger's formulation are represented by corresponding matrices in Heisenberg's matrix formulation. One can visualize the matrix of the physical observable [A] as an operator acting on a quantum state vector [ψ] and giving a new state vector [ψ′], i.e. [A][ψ] = [ψ′]. The eigenvalues and eigenfunctions of the matrix are the eigenvalues and eigenfunctions of the quantum system for that physical observable, i.e. [A][ψ_A] = A[ψ_A].

• Several properties of matrices and their computer implementation will be used for the solution of quantum mechanical problems that present difficulties in the analytical formulation.

Further reading

Most books on electronic properties of solids or quantum mechanics stick to the differential equation approach and do not embrace matrices. Unless you are a math major or a math aficionado, you typically see matrices in introductory math courses, and then grow comfortable with them by applying them in courses in science and engineering. There are too many good books on matrices and linear algebra, so I will be brief. To learn the properties of matrices to your heart's content, I recommend Linear Algebra and its Applications by G. Strang. Introduction to Fourier Analysis and Generalised Functions by M. Lighthill presents a solid foundation on Fourier series and connects to Dirac's insights into its role in quantum mechanics. Max Born and Heisenberg developed the first consistent form of quantum mechanics in the matrix form; the book Quantum Mechanics in Simple Matrix Form by T. Jordan explains quantum mechanics from the matrix perspective in an approachable and illuminating way.

Exercises

(6.1) Some properties of matrices
(a) The trace of a square matrix is the sum of its diagonal elements. Show that though two square matrices A and B may not commute, AB ≠ BA, their trace does: Trace(AB) = Trace(BA).

crystal maps exactly to the following matrix problem. In this problem, we get comfortable in solving energy eigenvalue problems on the computer using packages (for example, see Fig. 6.4). Include plots in your solution whenever applicable.

(b) Using the spectral decomposition of a square matrix A = UDU −1 and part (a) of this problem, show that the trace of a square matrix is equal to the sum of all the eigenvalues En of the square matrix, Trace( A) = ∑n En .

(a) As a warm-up, re-do the examples in Fig. 6.4 in your program of choice. For example, if you are using Mathematica, get comfortable with the Eigenvalue and Eigenfunction commands. For other programs, identify the corresponding commands and obtain the solutions of Fig. 6.4.

(c) For any complex number z, the Taylor series 2 3 of its exponent is ez = 1 + z + z2! + z3! + .... Using the spectral decomposition of a square matrix A = UDU −1 , show that a square matrix A follows the same exponential relation as a complex mum2 3 ber: e A = I + A + A2! + A3! + .... (d) For two complex numbers z1 and z2 , ez1 ez2 = ez1 +z2 . Verify that the matrix version of this relation e A e B = e A+ B+[ A,B]/2+... , where [ A, B] = AB − BA is the commutator of the matrices A and B. Several commutators of the matrices A and B show up in the exponent. This is the Baker–Campbell–Hausdorff formula that plays an important role in distinguishing quantum phenomena from classical phenomena for which the commutators are always zero. (6.2) Using matrices to solve energy eigenvalue problems The next few chapters show that the quantum mechanical problem of an electron in a semiconductor

(b) Consider N identical atoms, each of which has an allowed electron energy eigenvlaue E0 when the atoms are far apart and do not interact. But when the atoms are arranged in a 1D periodic crystal (say labelled 1, 2, .., n, ..., N), an electron can hop from an atom to its neighboring atoms. Let’s assume that the electron only hops from an atom to its nearest neighbors to the left and the right, i.e. from n → n ± 1. In doing so, it can lower its energy by t0 . Show why the Hamiltonian of this crystal is then the N × N square matrix   E0 t0 0 ... 0  t0 E0 t0 . . . 0    0 t0 E0 . . . 0  (6.31)  .  . .. .. ..  ..  ..  . . . . 0 0 0 . . . E0

Note that the original (or unperturbed) eigenvalues E0 sit in the diagonal terms, and the hopping terms t0 sit in bands next to the diagonal terms.

134 Exercises Argue from physical grounds why the sign of the hopping terms should typically be negative. (c) Implement the tightbinding matrix of Equation 6.31 on the computer for N = 2. Call it ”tightbinding2by2”. Find its eigenvalues. How many eigenvalues are there? Verify this by calculating the eigenvalues by hand. (c) Implement the tightbinding matrix of Equation 6.31 on the computer for N = 3. Call it ”tightbinding3by3”. Find its eigenvalues. How many eigenvalues are there? (d) Scale the implementation to N = 100, and call the matrix ”tightbinding100by100”. Use E0 = 1 eV and a hopping term t0 that you can manipulate. What are the N eigenvalues when t0 = 0? Is it reasonable? This collection of eigenvalues is called the bandstructure, just that for t0 = 0 there is not much structure to it! (e) Using your matrix ”tightbinding100by100”, plot the bandstructure for a few t0 by manipulating its value. What is the width of the band (i.e. the difference of the maximum and minimum eigenvalues)? For t0 = −0.3 eV, if I put N/5 = 20 electrons into the crystal, where will the Fermi energy EF be at T → 0 K? (f) Modify ”tightbinding100by100” to allow for electrons to hop to the next-nearest neighbors, meaning from atom number n → n ± 2, with a slightly lower strength t1 . How does the bandstructure change? (6.3) The Pauli spin matrices The four 2 × 2 matrices

$$\sigma_0 = I_2 = \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}, \tag{6.32}$$

$$\sigma_1 = \sigma_x = \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}, \tag{6.33}$$

$$\sigma_2 = \sigma_y = \begin{pmatrix} 0 & -i\\ i & 0\end{pmatrix}, \tag{6.34}$$

$$\sigma_3 = \sigma_z = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix} \tag{6.35}$$

are rather special in quantum mechanics. The first is the 2×2 identity matrix, and $\sigma_x$, $\sigma_y$, and $\sigma_z$ are called the Pauli spin matrices.
(a) Verify that $\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = I_2$, i.e. in matrix language, the three Pauli matrices are square roots of unity.
(b) Verify that each of the three Pauli spin matrices is Hermitian.
(c) Now consider the Pauli matrix operator acting on a quantum state $[\psi]$ to give $\sigma_x[\psi] = [\phi]$. Acting on the same state again gives $\sigma_x\sigma_x[\psi] = \sigma_x^2[\psi] = [\psi]$. Argue why this implies that the eigenvalue of $\sigma_x^2$ is +1, and since $\sigma_x$ is Hermitian, that the eigenvalues of $\sigma_x$ are ±1; this holds for each of the three Pauli matrices.
(d) Verify that the Pauli matrices form a cyclic commutation group, i.e. $[\sigma_x, \sigma_y] = \sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z$. The physical meaning of this relation: a rotation operation of a quantum state vector about the x axis followed by one about the y axis does not lead to the same state vector if the operations are done in reverse; the results are off by a rotation around the z axis.

(6.4) Fundamentals of 2 × 2 matrices
Consider the general 2×2 Hermitian matrix

$$H_2 = \begin{pmatrix} a & b\\ b^* & c\end{pmatrix}. \tag{6.36}$$

(a) Show that the matrix can be written as $H_2 = h_0 I_2 + \vec{h}\cdot\vec{\sigma}$, where $\vec{h} = (h_x, h_y, h_z)$ is a vector whose components are three complex numbers, and the three components of $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are the three Pauli matrices. Express $h_0, h_x, h_y, h_z$ in terms of the matrix elements a, b, c.
(b) Because the eigenvalues of the Pauli matrices are ±1, show that the eigenvalues of the general 2×2 matrix are $h_0 \pm |\vec{h}|$.
(c) The general structure of the 2 × 2 matrix will be of utmost importance in later chapters to explain the formation of semiconductor bandstructure, and in general quantum mechanical problems of transitions and time evolution. Explore a few properties of the eigenfunctions and eigenvalues of the matrix by looking at special cases of the matrix elements, such as a = c, b = 0, small values of b, etc.
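For Exercise 6.2, a minimal numpy sketch of the tight-binding matrix of Equation 6.31; the helper name and the default parameter values are our own choices, not the book's:

```python
import numpy as np

# Nearest-neighbor tight-binding chain of Exercise 6.2:
# E0 on the diagonal, hopping t0 on the two adjacent diagonals.
def build_tightbinding(N, E0=1.0, t0=-0.3):
    H = np.diag(np.full(N, E0))           # unperturbed atomic energies
    H += np.diag(np.full(N - 1, t0), 1)   # hop to the right neighbor
    H += np.diag(np.full(N - 1, t0), -1)  # hop to the left neighbor
    return H

H2 = build_tightbinding(2)
print(np.linalg.eigvalsh(H2))             # two eigenvalues: E0 - |t0| and E0 + |t0|

H100 = build_tightbinding(100, E0=1.0, t0=-0.3)
E = np.linalg.eigvalsh(H100)              # the 100-eigenvalue "bandstructure"
print(E.min(), E.max())                   # bandwidth approaches 4|t0| for large N
```

Plotting the sorted eigenvalues of `H100` for a few values of `t0` shows the band widening as the hopping strength grows, as the exercise asks.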

Perturbations to the Electron's Freedom

In Chapter 5 we discussed a few exactly solved problems in quantum mechanics. We also found that many applied problems may not be exactly solvable in an analytical form. The machinery to solve such problems is perturbation theory. To understand the physics of semiconductors and their nanostructures, we need the tool of perturbation theory in our arsenal. In Chapter 6, we developed the matrix formalism of quantum mechanics, which is well suited to handle perturbation theory. Sometimes we will be able to reduce the matrix solutions to closed-form algebraic expressions, which always helps in visualization. In this chapter, we develop an additional analytical tool for perturbation theory that is indispensable for the insight it provides. The goals of this chapter are then the following:

• Develop a matrix approach to perturbation theory, which will use the known eigenvalues and eigenfunctions of a quantum mechanics problem represented by the Hamiltonian $\hat{H}_0$ to identify the new eigenvalues and eigenfunctions upon a perturbation that changes the Hamiltonian to $\hat{H}_0 + \hat{W}$.
• Use some simplifications of non-degenerate energy eigenvalue systems to develop non-matrix based, algebraic results of perturbation theory.
• Become comfortable with the analytical Rayleigh–Schrödinger and Brillouin–Wigner non-degenerate perturbation theory results, and learn when to apply them, and when they are not applicable.
• Appreciate that the matrix version of perturbation theory is always applicable, both for degenerate and non-degenerate problems, by applying it to examples in preparation for its use in semiconductors.

Let $\hat{H}_0$ be the Hamiltonian for the solved problem. Then the time-dependent Schrödinger equation is $i\hbar\frac{\partial}{\partial t}|\Psi\rangle = \hat{H}_0|\Psi\rangle$. The eigenstates of definite energy are also stationary states $\langle r|n\rangle = \psi_{E_n}(r)e^{-iE_n t/\hbar}$, where $|n\rangle$ are the eigenvectors and $E_n$ the corresponding eigenvalues. Note that all the solved problems we discussed in Chapter 5, such as the harmonic oscillator or the particle in a box, had time-independent potentials.

7.1 Degenerate perturbation theory
7.2 Non-degenerate perturbation theory
7.3 The Brillouin–Wigner perturbation results
7.4 Rayleigh–Schrödinger perturbation results
7.5 The Hellmann–Feynman theorem
7.6 Perturbation theory example
7.7 Chapter summary section
Further reading



Many real-world situations involve time-dependent potentials. For example, imagine a field-effect transistor whose gate voltage is being modulated by an ac signal. That will create a potential variation for electrons of the form $V(r)e^{i\omega t}$. A similar variation will be experienced by electrons interacting with photons of an electromagnetic wave, or with phonons of lattice vibrations. Consider the limit of very small frequencies ω → 0, or a "DC" potential. Then, the potential only has a spatial variation. A dc voltage is not truly time-independent because it has to be turned on or off at some time. But most of the physics we are interested in, in this and a few following chapters, happens when the perturbation has been "on" for a long time in the past, and things have reached steady state. It is in this sense that we discuss time-independent perturbation theory. We defer explicitly time-varying or oscillatory perturbations to Chapter 20 and beyond.

7.1 Degenerate perturbation theory

The time-independent Schrödinger equation for the solved problem is

$$\hat{H}_0|n\rangle = E_n^0|n\rangle, \tag{7.1}$$

where $\hat{H}_0$ is the unperturbed Hamiltonian. That means we know all the eigenfunctions $|n\rangle$ and their corresponding eigenvalues $E_n^0$. This is shown in Fig. 7.1. Let's add a perturbation W to the initial Hamiltonian such that the new Hamiltonian becomes $\hat{H} = \hat{H}_0 + W$. The new Schrödinger equation is

$$(\hat{H}_0 + W)|\psi\rangle = E|\psi\rangle. \tag{7.2}$$

The perturbation W has changed the eigenvectors from |ni → |ψi. The old eigenvalues may not be eigenvalues of the new Hamiltonian. Some eigenvalues increase in energy, some decrease, and others may not be affected. This is illustrated in Fig 7.1. So we have to solve for the new eigenvalues E and obtain the corresponding eigenvectors. At this stage, we invoke the expansion principle introduced in Chapter 6. It states that the perturbed state vector |ψi can always be expressed as a linear superposition of the unperturbed eigenvectors |ni, since the unperturbed eigenstates form a complete basis set. It is the same philosophy of expanding any function in terms of its Fourier components. Thus we write

$$|\psi\rangle = \sum_n a_n |n\rangle, \tag{7.3}$$



where the $a_n$'s are (in general complex) expansion coefficients. The coefficients are obtained by taking the projection $\langle n|\psi\rangle$, which yields $a_n = \langle n|\psi\rangle$. Then Equation 7.2 reads

$$\sum_n a_n (\hat{H}_0 + W)|n\rangle = E \sum_n a_n |n\rangle. \tag{7.4}$$





Fig. 7.1 Illustrating the effect of a perturbation. The initial eigenstates and eigenvalues of a quantum system change upon application of a perturbation W.

We can visualize the new state vector $|\psi\rangle$ as the original eigenvector "rotated" by the perturbation W, as we had introduced in Chapter 6, and specifically in Fig. 6.3. Let us project the left and right hand sides of Equation 7.4 on $\langle m|$ to get

$$\sum_n a_n \langle m|(\hat{H}_0 + W)|n\rangle = E a_m, \tag{7.5}$$



which is a matrix equation when m takes the values 1, 2, ..., N:

$$\begin{pmatrix} E_1 + W_{11} & W_{12} & W_{13} & \cdots & W_{1N}\\ W_{21} & E_2 + W_{22} & W_{23} & \cdots & W_{2N}\\ W_{31} & W_{32} & E_3 + W_{33} & \cdots & W_{3N}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ W_{N1} & W_{N2} & W_{N3} & \cdots & E_N + W_{NN}\end{pmatrix}\begin{pmatrix} a_1\\ a_2\\ a_3\\ \vdots\\ a_N\end{pmatrix} = E\begin{pmatrix} a_1\\ a_2\\ a_3\\ \vdots\\ a_N\end{pmatrix} \tag{7.6}$$

The eigenvalues and the corresponding eigenvectors of this matrix equation are obtained by diagonalization, as discussed in Chapter 6. The new eigenvalues $E'_n$ thus depend on the matrix elements $W_{mn} = \langle m|W|n\rangle$ of the perturbation. Note that if some eigenvalues of the unperturbed Hamiltonian happen to be degenerate, the matrix diagonalization method takes that into account naturally, without problems. In that sense, the matrix formulation of perturbation theory is sometimes referred to as degenerate perturbation theory. But the matrix formulation handles non-degenerate situations equally well, and is perfectly general. In case we did not start with a diagonal basis of the unperturbed


Hamiltonian $\hat{H}_0$, then we have the Schrödinger equation

$$\begin{pmatrix} H^0_{11} + W_{11} & H^0_{12} + W_{12} & \cdots & H^0_{1N} + W_{1N}\\ H^0_{21} + W_{21} & H^0_{22} + W_{22} & \cdots & H^0_{2N} + W_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ H^0_{N1} + W_{N1} & H^0_{N2} + W_{N2} & \cdots & H^0_{NN} + W_{NN}\end{pmatrix}\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_N\end{pmatrix} = E\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_N\end{pmatrix}. \tag{7.7}$$

The solutions thus reduce to diagonalizing the corresponding perturbed matrices. Many of the perturbation matrix elements $W_{mn} = \langle m|W|n\rangle$ can be made zero by a suitable choice of basis, which reduces the work involved in diagonalization. Note that while we can easily obtain the eigenvalues, the eigenvectors typically require more work. But with the availability of math packages such as Mathematica and MATLAB, this is done in a jiffy for most situations we will deal with.

An important question in applying degenerate perturbation theory is: which states should be included in the N × N matrix? We state the guiding principles, which will be proven in the next section on non-degenerate perturbation theory. First principle: eigenstates whose eigenvalues are widely separated in energy $E_n - E_m$ interact weakly through the perturbation. Second principle: the change in an eigenstate's energy is proportional to the perturbation matrix element squared, $|W_{mn}|^2$. Quantitatively, the perturbation in energy of a state with energy E due to its interaction with state $|n\rangle$ is $\Delta E \approx |W_{mn}|^2/(E - E_n)$. Thus, states for which the $W_{mn}$ terms are small or zero may be left out. If we are interested in a set of energy eigenvalues (say near the conduction and valence band edges of a semiconductor), energies far away from the band edges may also be left out. We will see the application of these rules in many following chapters.

We will see examples of degenerate perturbation theory in the next chapter (Chapter 8), where we will apply it to the problem of a free electron. That will require us to solve either 2×2 matrices, or matrices of higher order, depending on the accuracy we need. Later on, we will also encounter it when we discuss the k·p theory of bandstructure to deal with the degeneracies of the heavy and light valence bands. For now, we look at particular situations when we have "isolated" eigenvalues that are non-degenerate.
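The two guiding principles can be checked on the smallest possible example; a 2×2 sketch with illustrative numbers of our own choosing, comparing exact diagonalization against the second-order estimate $\Delta E \approx |W_{mn}|^2/(E - E_n)$:

```python
import numpy as np

# Two well-separated unperturbed levels plus a weak off-diagonal coupling.
E1, E2 = 1.0, 3.0
w = 0.1
H = np.array([[E1, w],
              [w, E2]])

exact = np.linalg.eigvalsh(H)      # "degenerate perturbation theory" = diagonalize
shift_exact = exact[0] - E1        # shift of the lower level
shift_2nd = w**2 / (E1 - E2)       # second-order estimate |W12|^2 / (E1 - E2)
print(shift_exact, shift_2nd)      # nearly equal for weak coupling
```

Because $E_1 - E_2$ is large compared to w, the interaction is weak and the lower level shifts only slightly (and downward, since it interacts with a state above it).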


7.2 Non-degenerate perturbation theory

Schrödinger's crowning achievement was to obtain an algebraic equation which, when solved, yields the quantum states allowed for electrons. Schrödinger's equation is called the "wave" equation because it was constructed in analogy to Maxwell's equations for electromagnetic waves. Heisenberg was the first to achieve the breakthrough in quantum mechanics, before Schrödinger, except that his version involved matrices, which he called matrix mechanics. Because matrices were

  . 


Fig. 7.2 Perturbation of a state vector. The perturbation rotates the eigenvector $|u\rangle$ to $|\psi\rangle$. If we forego normalization of $|\psi\rangle$, we can find a vector $|\phi\rangle$ orthogonal to $|u\rangle$ such that $\langle u|\phi\rangle = 0$, and consequently $\langle u|\psi\rangle = 1$.

unfamiliar at that time in physics, this formulation was not immediately accepted. It was later that the mathematician von Neumann (Fig. 7.3) proved that both approaches were actually identical from a mathematical point of view. So at this point, we will try to return to "familiar" territory in perturbation theory from the matrix version presented in the previous section. We try to formulate an algebraic method to find the perturbed eigenvalues and eigenvectors.

Consider a perturbation $\hat{W}$ added to the solved (or unperturbed) Hamiltonian $\hat{H}_0$. The Schrödinger equation is

$$(\hat{H}_0 + \hat{W})|\psi\rangle = E|\psi\rangle, \tag{7.8}$$

and the unperturbed state $|u\rangle$ satisfies

$$\hat{H}_0|u\rangle = E_u|u\rangle. \tag{7.9}$$

The new quantum state differs from the unperturbed state, so we write

$$|\psi\rangle = |u\rangle + |\phi\rangle. \tag{7.10}$$

We can picture the final state $|\psi\rangle$ as a "vector" sum of the unperturbed state $|u\rangle$ and the vector $|\phi\rangle$. This is schematically shown in Fig. 7.2. In particular, if we are willing to not normalize the final state, then we can always choose $|\phi\rangle$ to be orthogonal to $|u\rangle$, leading to $\langle u|\phi\rangle = 0$ and $\langle u|\psi\rangle = 1$. We can then project Equation 7.8 on $\langle u|$ to obtain the energy equation

$$E = E_u + \langle u|W|\psi\rangle = E_u + \underbrace{\langle u|W|u\rangle}_{\Delta E^{(1)}} + \underbrace{\langle u|W|\phi\rangle}_{\text{higher orders}}. \tag{7.11}$$

Note that we obtain the "first-order" energy correction $\Delta E^{(1)}$: these are the diagonal matrix elements of the perturbation in the unperturbed eigenstates. Think of a "DC" perturbation – say a voltage $V_0$ that depends neither on space nor time. All the initial energy eigenvalues shift by the corresponding energy, $E_u \to E_u + qV_0$, due to the first

Fig. 7.3 John von Neumann was one of the foremost mathematicians of the 20th century. He rigorously established the mathematical principles of quantum mechanics. He was instrumental in the conception and building of the first digital computer. The computer architecture of most microprocessors uses the von Neumann architecture.


order term, since $\langle u|qV_0|u\rangle = qV_0$. We will shortly see that for this particular perturbation, the higher-order terms are zero because they depend on cross matrix elements of the kind $\langle m|qV_0|n\rangle = qV_0\langle m|n\rangle = 0$ for n ≠ m. An example of such a situation is when a voltage is applied across a gate capacitor to a semiconductor – the entire bandstructure, which consists of the allowed $E_n$'s, shifts rigidly up or down. We call this energy band-bending in device physics. Such a "DC" perturbation does not couple different energy states for the above reason, and results in only a first-order rigid shift in energies. Most perturbations are not of the "DC" kind; thus we need the higher-order terms for them. To do that, it is useful to define

$$E'_u = E_u + \langle u|W|u\rangle. \tag{7.12}$$


We then split the diagonal and off-diagonal elements of the perturbation, just like decomposing a signal into its "DC" + "AC" terms. Think of $\hat{W}$ as an operator. Hence we are splitting a matrix into two:

$$\hat{W} = \hat{D} + \hat{W}', \tag{7.13}$$

and the total Hamiltonian then is

$$\hat{H} = \underbrace{\hat{H}_0 + \hat{D}}_{\hat{H}^{(d)}} + \hat{W}'. \tag{7.14}$$

The reason for doing this is that the unperturbed eigenvalues are going to shift by the diagonal part of the perturbation $\hat{D}$ without interacting with other states. The off-diagonal terms in $\hat{W}'$ will further tweak them by interactions with other states. To move further, we write

$$(\hat{H}^{(d)} + \hat{W}')|\psi\rangle = E|\psi\rangle, \tag{7.15}$$

and rearrange it to

$$(E - \hat{H}^{(d)})|\psi\rangle = \hat{W}'|\psi\rangle. \tag{7.16}$$

At this stage, our goal is to find the perturbation vector $|\phi\rangle = |\psi\rangle - |u\rangle$. How can we obtain it from the left side of Equation 7.16 in terms of the perturbation on the right? Recall that in Chapter 6 we discussed the Green's function operator briefly. We noticed that it is an "inverse" operator, meaning we expect

$$\hat{G}(E)(E - \hat{H}^{(d)})|\psi\rangle = \sum_m \frac{|m\rangle\langle m|}{E - E'_m}(E - \hat{H}^{(d)})|\psi\rangle = \sum_m |m\rangle\langle m|\psi\rangle = |\psi\rangle. \tag{7.17}$$

So to get $|\phi\rangle = |\psi\rangle - |u\rangle$, perhaps we should use the operator

$$\hat{G}(E) - \frac{|u\rangle\langle u|}{E - E'_u} = \sum_{m\neq u} \frac{|m\rangle\langle m|}{E - E'_m}. \tag{7.18}$$

Operating on the LHS of Equation 7.16 we obtain

$$\sum_{m\neq u} \frac{|m\rangle\langle m|}{E - E'_m}(E - \hat{H}^{(d)})|\psi\rangle = \sum_{m\neq u} |m\rangle\langle m|\psi\rangle = \Big(\sum_m |m\rangle\langle m|\psi\rangle\Big) - |u\rangle\langle u|\psi\rangle = |\psi\rangle - |u\rangle = |\phi\rangle, \tag{7.19}$$




which is what we wanted. Now we use the same operator on the right of Equation 7.16 to finish the job. Since $\hat{W}'$ consists of only off-diagonal cross matrix elements, we write it in its outer-product form as $\hat{W}' = \sum_m \sum_{n\neq m} |m\rangle\langle m|\hat{W}|n\rangle\langle n|$, and apply the "reduced" Green's function to get

$$|\phi\rangle = \sum_{l\neq u}\frac{|l\rangle\langle l|}{E - E'_l}\sum_m\sum_{n\neq m}|m\rangle\langle m|\hat{W}|n\rangle\langle n|\psi\rangle = \sum_{m\neq u}\sum_{n\neq m}|m\rangle\frac{\langle m|\hat{W}|n\rangle}{E - E'_m}\langle n|\psi\rangle. \tag{7.20}$$

Thus, we obtain the perturbed state $|\psi\rangle = |u\rangle + |\phi\rangle$ to be

$$|\psi\rangle = |u\rangle + \sum_{m\neq u}\sum_{n\neq m}|m\rangle\frac{\langle m|\hat{W}|n\rangle}{E - E'_m}\langle n|\psi\rangle. \tag{7.21}$$

As a sanity check, we note that if $\hat{W} = 0$, $|\psi\rangle = |u\rangle$, as it should be. Next, we note that this is a recursive relation, meaning the unknown $|\psi\rangle$ also appears inside the sum on the right side. Thus, by successive substitution of the left-hand side into the right, it can be taken to many orders. But we are going to retain terms just up to the second order. That means we will assume that the perturbation is weak, and so we are justified in replacing the $|\psi\rangle$ inside the sum on the right side by the unperturbed state $|u\rangle$.

7.3 The Brillouin–Wigner perturbation results

From the above considerations, we get the result for the perturbed eigenstates

$$|\psi\rangle \approx |u\rangle + \underbrace{\sum_{m\neq u}|m\rangle\frac{\langle m|\hat{W}|u\rangle}{E - E'_m}}_{|\phi\rangle^{(1)}}. \tag{7.22}$$

Fig. 7.4 Leon Brillouin, as in the Brillouin function, and the Brillouin zone in solid–state physics, is also the ”B” of the WKB approximation. He discovered one of the three fundamental light-matter scattering processes (the other two being Rayleigh and Raman scattering).


The perturbed state vector given by Equation 7.22 is now in a form that can be used for calculations. That is because every term on the right side is known, except for the energy E in the denominator. To obtain the perturbed eigenvalues E, we substitute Equation 7.22 into the expression for energy, $E = E_u + \langle u|W|u\rangle + \langle u|W|\phi\rangle$, to obtain

$$E \approx E_u + \underbrace{\langle u|\hat{W}|u\rangle}_{\Delta E^{(1)}} + \underbrace{\sum_{m\neq u}\frac{|\langle m|\hat{W}|u\rangle|^2}{E - E'_m}}_{\Delta E^{(2)}}. \tag{7.23}$$
This result is called the Brillouin–Wigner (BW) perturbation theory after those who developed it (Figs. 7.4 and 7.5). Note that we

Fig. 7.5 Eugene Wigner, a mathematical physicist who introduced seminal notions of symmetry in atomic spectra, quantum mechanics, and solid-state physics. Wigner was awarded the 1963 Nobel Prize in physics. Wigner’s memorable statement: ”It is nice that the computer understands the problem. But I would like to understand it too”. Dirac was married to Wigner’s sister.

142 Perturbations to the Electron’s Freedom

need to solve for the unknown eigenvalues E. For multiple states, the solution requires finding the roots of a high-order polynomial, since Equation 7.23 is indeed a polynomial in E. For example, say we were looking at a three-state problem with unperturbed energies $E_{u1}, E_{u2}, E_{u3}$, and we want to find how the eigenvalue of state u = 2 is modified by the perturbation. Then the second-order energy correction has two terms, since m ≠ 2. The equation then becomes a third-order polynomial with three roots, which are the desired eigenvalues.
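Solving the BW equation can be sketched on a two-level example (all numbers here are our own, not from the text). For two levels the BW equation is exactly the characteristic polynomial of the 2×2 Hamiltonian, so a simple fixed-point iteration for E converges to the exact eigenvalue:

```python
import numpy as np

# Two-level Brillouin-Wigner illustration.
E1p, E2p = 1.0, 2.0   # E'_1, E'_2: unperturbed energies plus diagonal corrections
W12 = 0.3             # off-diagonal matrix element <1|W|2>

# BW for state 1: E = E'_1 + |W12|^2 / (E - E'_2), solved by fixed-point iteration
E = E1p
for _ in range(100):
    E = E1p + abs(W12)**2 / (E - E2p)

# For two levels BW is exact: compare with direct diagonalization
exact = np.linalg.eigvalsh(np.array([[E1p, W12], [W12, E2p]]))[0]
print(E, exact)
```

With more than two levels, the same self-consistent condition becomes a higher-order polynomial in E, exactly as described above.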

7.4 Rayleigh–Schrödinger perturbation results

Fig. 7.6 Lord Rayleigh was the co-discoverer of argon, and of Rayleigh scattering of long-wavelength light waves from matter, which explains why the sky is blue. The perturbation problem is essentially also a scattering problem in disguise, because one can imagine the unperturbed states being scattered into new states because of the perturbing potential. Rayleigh was awarded the 1904 Nobel Prize in Physics. J. J. Thomson, the discoverer of the electron, was Rayleigh's student.

The BW perturbation theory requires us to solve polynomial equations for the perturbed energies. This can be avoided if we are willing to compromise on the accuracy. If so, the unknown energy term E in the denominator of the second-order correction term may be replaced by the unperturbed value, E → $E_u$. Then the energy eigenvalues are obtained directly from

$$E \approx E_u + \langle u|\hat{W}|u\rangle + \sum_{m\neq u}\frac{|\langle m|\hat{W}|u\rangle|^2}{E_u - E'_m}, \tag{7.24}$$

and the eigenfunctions are

$$|\psi\rangle \approx |u\rangle + \underbrace{\sum_{m\neq u}|m\rangle\frac{\langle m|\hat{W}|u\rangle}{E_u - E'_m}}_{|\phi\rangle^{(1)}}. \tag{7.25}$$

This set of perturbed eigenfunctions and eigenvalues is called the Rayleigh–Schrödinger (RS) perturbation theory. Note that in this form, we know all the terms on the right side. It was first derived by Schrödinger right after his seminal work on the wave equation of electrons. The RS theory is not applicable for understanding the perturbation of degenerate states, as the denominator $E_u - E'_m$ can go to zero. But the BW theory applies for degenerate states too, and one can always fall back on the degenerate perturbation theory. Schrödinger originally derived this result and referred to Rayleigh's (Fig. 7.6) work on classical perturbation theory of the effect of inhomogeneities on the vibration frequencies of mechanical strings. The naming scheme pays homage to both the quantum and the classical versions. In the treatment of degenerate perturbation theory earlier, we discussed the strategy for choosing which states to include in the matrix. The last term in the BW or RS perturbation theory results provides the guiding principle. Note that this term goes as the perturbation matrix element squared, divided by the energy difference. In the absence of the perturbation, the eigenvectors corresponding to the eigenvalues were


Fig. 7.7 Illustration of level repulsion due to perturbation.

orthogonal, meaning they did not "talk" to each other. The perturbation mixes the states, and makes them talk. The magnitude by which the energy of a state $|u\rangle$ changes due to interactions with all other states upon perturbation is $\Delta E^{(2)} \approx \sum_{m\neq u} |W_{mu}|^2/(E_u - E'_m)$. We also note the nature of the interaction. If a state with energy $E_u$ is interacting with states with energies $E'_m$ lower than itself, then $\Delta E^{(2)} > 0$: the perturbation pushes the energy up. Similarly, interactions with states with higher energies push the energy of the state $E_u$ down. Thus, the second-order interaction term in perturbation theory is repulsive. Fig. 7.7 illustrates this effect schematically. This repulsive interaction is the key to understanding the curvatures of energy bands and the relation between effective masses and energy bandgaps of semiconductors. Clearly, if two states are non-degenerate and the strength of the perturbation is increased from zero, the energy eigenvalues repel more strongly, and the levels move farther apart. Then they cannot cross each other¹.

7.5 The Hellmann–Feynman theorem The following theorem is sometimes useful for obtaining quick estimates of the magnitude and direction of the shift in energy eigenvalues upon perturbation. Let Hˆ 0 (λ)|k i = Ek (λ)|k i be the exactly solved problem with normalized eigenstates |ki, where the Hamiltonian Hˆ 0 (λ) and its resulting eigenvalues Ek (λ) depend on a parameter λ. When we change the parameter λ in the Hamiltonian operator Hˆ 0 (λ), how do the eigenvalues Ek (λ) change? Because the original eigenstates are orthonormal, we have hk|k i = 1. d Differentiating with respect to the parameter λ, we have dλ hk|ki = 0.

¹ This goes by the name of the no-level-crossing theorem of perturbation theory in quantum mechanics.
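The repulsion and no-crossing behavior can be sketched with a 2×2 example (the numbers are illustrative choices of our own), sweeping the coupling strength and watching the splitting grow:

```python
import numpy as np

# Level repulsion: sweep the coupling w in a 2x2 Hamiltonian and
# watch the eigenvalue splitting grow monotonically.
E1, E2 = 1.0, 1.2
for w in [0.0, 0.1, 0.2, 0.4]:
    H = np.array([[E1, w], [w, E2]])
    lo, hi = np.linalg.eigvalsh(H)
    print(f"w = {w:.1f}: splitting = {hi - lo:.3f}")
# The splitting is sqrt((E2 - E1)^2 + 4 w^2), which never reaches zero
# for E1 != E2: the levels repel and cannot cross.
```

This is the 2×2 Hermitian matrix of Exercise 6.4 in disguise: the eigenvalues $h_0 \pm |\vec{h}|$ coincide only when the off-diagonal coupling and the diagonal splitting both vanish.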


Now, applying the chain rule for differentiation,

$$\frac{d}{d\lambda}E_k(\lambda) = \frac{d}{d\lambda}\langle k|\hat{H}_0(\lambda)|k\rangle = \langle \tfrac{dk}{d\lambda}|\hat{H}_0(\lambda)|k\rangle + \langle k|\tfrac{d\hat{H}_0(\lambda)}{d\lambda}|k\rangle + \langle k|\hat{H}_0(\lambda)|\tfrac{dk}{d\lambda}\rangle, \tag{7.26}$$

and because $\hat{H}_0(\lambda)|k\rangle = E_k(\lambda)|k\rangle$ and $\langle k|\hat{H}_0(\lambda) = E_k(\lambda)\langle k|$, we get

$$\frac{d}{d\lambda}E_k(\lambda) = \langle k|\frac{d\hat{H}_0(\lambda)}{d\lambda}|k\rangle + E_k(\lambda)\underbrace{\left[\langle \tfrac{dk}{d\lambda}|k\rangle + \langle k|\tfrac{dk}{d\lambda}\rangle\right]}_{\frac{d}{d\lambda}\langle k|k\rangle = 0}$$

$$\boxed{\;\frac{dE_k(\lambda)}{d\lambda} = \langle k|\frac{d\hat{H}_0(\lambda)}{d\lambda}|k\rangle\;} \tag{7.27}$$

The boxed equation above is the Hellmann–Feynman theorem. It states that we can get the change in the energy eigenvalues due to a parameter λ by evaluating the expectation value of the derivative of the Hamiltonian operator with respect to λ in the corresponding eigenstate.
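The theorem is easy to check numerically; a sketch with an arbitrary symmetric $H_0$ and $V$ of our own construction, comparing a finite-difference derivative of the lowest eigenvalue against the expectation value $\langle k|dH/d\lambda|k\rangle$:

```python
import numpy as np

# Hellmann-Feynman check: H(lam) = H0 + lam*V, so dH/dlam = V and
# the theorem predicts dE_k/dlam = <k|V|k>.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)); H0 = (A + A.T) / 2   # arbitrary symmetric matrices
B = rng.normal(size=(4, 4)); V = (B + B.T) / 2

def lowest(lam):
    E, U = np.linalg.eigh(H0 + lam * V)
    return E[0], U[:, 0]

lam, h = 0.3, 1e-5
E0, k = lowest(lam)
dE_fd = (lowest(lam + h)[0] - lowest(lam - h)[0]) / (2 * h)  # finite difference
dE_hf = k @ V @ k                                            # <k| dH/dlam |k>
print(dE_fd, dE_hf)
```

The two numbers agree to within the finite-difference error; note the eigenstate $|k\rangle$ itself need not be differentiated, which is the whole point of the theorem.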

7.6 Perturbation theory example

Consider the particle-in-a-box problem that was discussed in Chapter 5, Section 5.6, as shown in Fig. 7.8. Here is the question: we know the exactly solved particle-in-a-box eigenfunctions of state $|n\rangle$, $\langle x|n\rangle = \psi_n(x) = \sqrt{\frac{2}{L}}\sin(\frac{n\pi x}{L})$, and the corresponding eigenvalues $E_n = \frac{\hbar^2}{2m}(\frac{n\pi}{L})^2$. Introduce a tiny perturbation potential of strength $W(x) = +W_0$ over a length a

Fig. 7.8 A perturbation to the particle-in-a-box problem.

for S > 0, and the lowest band is rather narrow. This means the electron is "sluggish" in this band, and it has a large effective mass. As we move up to higher energies, the points of degeneracy develop sharper curvatures and the bands become wider, making the electron effective mass lighter. Note the differences for the attractive delta potential (S < 0) band


structures highlighted by the right panel in Fig. 13.4, and drawn at exactly the same scale for easy comparison. The lowest allowed energy now is $E_{min} < 0$ for S < 0, i.e. it is negative, in stark contrast to the situation for S > 0. The Hellmann–Feynman theorem now guarantees that the eigenvalues are lower than in the NFE case. At the k-points of degeneracy, the splitting is such that one eigenvalue stays put again, but the other is pushed down, exactly opposite to the case of S > 0. The lowest eigenvalue of each Kronig–Penney band is degenerate with the NFE eigenvalues $E(k = n\frac{\pi}{a}) = n^2 F$ again, where n = 1, 2, ..., locating energy eigenvalues F, 9F, ... at $k = \pm\frac{\pi}{a}$ at the BZ edge, and 4F, 16F, ... at k = 0, which are now the minima of the corresponding bands.

13.3 Tight-binding models emerge from Kronig–Penney

In Chapter 11 we discussed the tight-binding model for bandstructure starting from atomic orbitals. We will now see that tight-binding models emerge naturally from the exact Kronig–Penney model. Applying the trigonometric identity $\cot(x) = \sum_{n=-\infty}^{+\infty}\frac{1}{n\pi + x}$ to the right hand side of the central Kronig–Penney eigenvalue Equation 13.8, with $G_n = n\frac{2\pi}{a}$, Equation 13.8 transforms into

$$\cos(ka) = \cos(qa) + \frac{mSa}{\hbar^2}\cdot\frac{\sin(qa)}{qa}, \tag{13.9}$$

where $q = \sqrt{2mE_k/\hbar^2}$. This is still an exact solution of the Schrödinger equation. Now the values of $E_k$ that satisfy this equation will form the energy bandstructure E(k) for each k. The left hand side is limited to −1 ≤ cos(ka) ≤ +1, but the RHS of Equation 13.9 can reach values up to $1 + \frac{mSa}{\hbar^2} = 1 + C$, which can clearly exceed unity³. This restricts the allowed values of q for real energy eigenvalues $E = \frac{\hbar^2 q^2}{2m}$ for each k. Fig. 13.5 shows the "bands" of q where the RHS lies between −1 ≤ RHS ≤ +1, and real energy eigenvalues are allowed.

³ The maximum value of $\frac{\sin x}{x}$ is 1.

Now the zeroes of $\frac{\sin(x)}{x}$ occur at x = nπ, where n = ±1, ±2, .... It is clear that a band of q-values, and the corresponding energies, are allowed near the zeroes of the RHS, as indicated in Fig. 13.5 (left). Let us find an approximate solution for the first band $E_1(k)$ by expanding the RHS for a large strength, $C = \frac{mSa}{\hbar^2} \gg 1$, near the first zero at n = 1, around qa = π. Using δ = π − qa, the expansion yields $\cos(qa) + C\cdot\frac{\sin(qa)}{qa} \approx -1 + \frac{C}{\pi}\delta$, which when used in Equation 13.9 yields

$$E_1(k) \approx E_0 - 2J(1 + \cos ka), \tag{13.10}$$

where $E_0 = \frac{\pi^2\hbar^2}{2ma^2}$ coincides with the NFE energy at $k = \frac{\pi}{a}$, and the "hopping" or tunneling term is $J = \frac{\pi^2\hbar^4}{2m^2a^3S} = \frac{E_0}{C}$. This is clearly in the form of a tight-binding model of Chapter 11! Now we really don't

need to stop at the first root – expanding around qa = nπ, and retaining only the linear expansion terms, we get a more comprehensive tight-binding bandstructure of the nth band as: 2 1 (−1)n En (k) ≈ n2 E0 1 − + cos(ka) . C C


Fig. 13.5 shows a plot of the first three bands for the dimensionless strength C = 10. Note that the energy eigenvalues at the BZ edges coincide with the free-electron values. This is similar to the case for S > 0 in the Kronig–Penney model in Fig. 13.4 (left).
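The exact Equation 13.9 can also be solved numerically; a sketch in units of $E_0$ (the root-bracketing interval and helper names are our own choices) that extracts the first band and compares it with the tight-binding approximation of Equation 13.11:

```python
import numpy as np

# Solve cos(ka) = cos(x) + C sin(x)/x with x = qa for the first band,
# then compare with the tight-binding form of Equation 13.11.
# Energies are in units of E0 = pi^2 hbar^2 / (2 m a^2), so E/E0 = (x/pi)^2.
C = 10.0
rhs = lambda x: np.cos(x) + C * np.sin(x) / x

def bisect(f, lo, hi, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(lo) * f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

ka = np.linspace(1e-3, np.pi - 1e-3, 50)
# The first band lives just below x = pi, where the RHS sweeps through [-1, 1].
E_exact = np.array([(bisect(lambda x: rhs(x) - np.cos(k), 2.0, np.pi) / np.pi)**2
                    for k in ka])
E_tb = 1 - 2/C - (2/C) * np.cos(ka)    # Equation 13.11 for n = 1, in units of E0
print(np.max(np.abs(E_exact - E_tb)))  # deviation shrinks as C grows
```

Both curves reach E = $E_0$ exactly at the BZ edge ka = π, while near the band bottom the linear expansion behind Equation 13.11 is only approximate for finite C.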

13.4 Point defects in Kronig–Penney models

Now imagine that in the Kronig–Penney model, only one of the N sites has a potential that is different from the other sites. Let us call this difference in strength $U_0$, meaning that at this particular site the delta-function strength is $S + U_0$ instead of S, where $U_0$ can be positive or negative. What is the effect on the energy eigenvalues and the eigenstates due to the presence of this "defect"? This problem can now be solved, because the exact solution of the Kronig–Penney model without the defect has given us the eigenvalues for each k-state in the BZ as $E_{KP}(k)$ – for example – shown in Fig. 13.4. Then, we go through exactly the same procedure that led to the Kronig–Penney solution in Equation 13.8, and end up with the new solution

$$\frac{Na}{U_0} = \sum_k \frac{1}{E_k - E_{KP}(k)}, \tag{13.12}$$

Fig. 13.5 The left figure shows a plot of the RHS of Equation 13.9 with x = qa; the LHS is limited to −1 ≤ LHS ≤ +1. The narrow bands within which the two sides can be equal are highlighted; each leads to an allowed energy band. Because the intersections are near x = nπ, where n = ±1, ±2, ..., an approximate analytical expression for all the bands can be obtained (see Equation 13.11). The first three tight-binding bands are plotted in the right panel. Compare with Fig. 13.4.

276 1, 2, 3 ... ∞: Pseudopotentials and Exact Bandstructure




Fig. 13.6 Figures showing the effect of defect states on the allowed energy eigenvalues as a function of the defect potential strength. The left figure shows the graphical solution to the Kronig–Penney type solution, and in particular illustrates the splitting off of one eigenvalue – the highest eigenvalue of the band for positive defect potentials, and the lowest energy eigenvalues for negative defect potentials. This is further highlighted in the figure on the right, where the eigenvalue spectrum is plotted as a function of the defect potential.

















where k are the allowed states in the first BZ, N is the number of lattice sites, and therefore Na = L is the macroscopic length. Clearly, in the absence of the defect, U₀ → 0, and the LHS → ∞. This happens exactly N times in the RHS when the allowed energies E_k = E_KP(k), i.e., we recover the original Kronig–Penney solution without the defect, as we should. But when U₀ ≠ 0, the allowed energies E_k must deviate from E_KP(k) to satisfy Equation 13.12. To illustrate the solution graphically, we plot the RHS and the LHS in Fig. 13.6. We will see in the next section that the RHS of Equations 13.12 and 13.8 is actually the trace of the Green’s function matrix of the problem, i.e., Σ_k 1/(E − E_KP(k)) = Trace[Ĝ(E)]. The plot in Fig. 13.6 for a few-site chain shows the effect of the defect on the eigenvalue spectrum clearly. The figure on the right illustrates the movement of the eigenvalues as the strength of the defect U₀ is tuned from zero to large positive and large negative values. The eigenvalues at U₀ = 0 constitute the band without the defect. When the defect strength is positive and strong, the LHS line L/U₀ moves closer to the x-axis (left figure), and it is clear that one of the intersections – at the top of the energy band – splits off from the band rapidly, whereas all other eigenvalues do not change as much. Any change is positive, as guaranteed by the Hellmann–Feynman theorem. This is a characteristic feature – similarly, for a negative U₀, the lowest eigenvalue of the band splits off and leaves the other eigenvalues mostly unchanged. We will see later that U₀ > 0 ”defects” explain the formation of acceptor states at the top of valence bands, and are designed such that the splitting energy is less than k_B T for room-temperature generation of holes. Similarly, the bottom of the band with U₀ < 0 models donor


states and electron doping at the bottom of the conduction band of semiconductors. This exact model also explains the formation of deep level defects inside the bandgap of semiconductors.
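Equation 13.12 is easy to solve numerically for a toy few-site chain. The Python sketch below is a minimal illustration (the N = 8 band eigenvalues standing in for E_KP(k), the lattice constant a = 1, and the U₀ values are all made-up numbers): it brackets one root between each adjacent pair of poles, plus the split-off level outside the band, reproducing the behavior of Fig. 13.6:

```python
import numpy as np

def defect_levels(E_band, U0, a=1.0):
    """Solve N*a/U0 = sum_i 1/(E - E_i) (Eq. 13.12) by bisection.

    E_band: unperturbed eigenvalues E_KP(k); U0: nonzero defect strength.
    Returns N new eigenvalues: one between each adjacent pair of poles,
    plus one split-off level outside the band.
    """
    E_band = np.sort(np.asarray(E_band, dtype=float))
    N = len(E_band)
    target = N * a / U0

    def f(E):
        return np.sum(1.0 / (E - E_band)) - target

    def bisect(lo, hi):
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)

    eps = 1e-9
    roots = [bisect(E_band[i] + eps, E_band[i + 1] - eps) for i in range(N - 1)]
    span = E_band[-1] - E_band[0] + 1.0
    if U0 > 0:   # split-off level appears above the band
        roots.append(bisect(E_band[-1] + eps, E_band[-1] + 100 * span))
    else:        # split-off level appears below the band
        roots.append(bisect(E_band[0] - 100 * span, E_band[0] - eps))
    return np.sort(np.array(roots))

# toy band of N = 8 eigenvalues, in arbitrary energy units
E_kp = 1.0 - np.cos(np.linspace(0, np.pi, 8))
print(defect_levels(E_kp, U0=+2.0))  # top eigenvalue splits off upward
```

For U₀ > 0 every eigenvalue moves up slightly and the topmost one splits off; flipping the sign of U₀ splits off the bottom one instead, as the text describes for donors and acceptors.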

13.5 Green’s functions from Kronig–Penney models

We noted the repeated appearance of sums over the Brillouin zone of the kind Σ_k 1/(E − E(k)), which have units of (energy)⁻¹. This may be thought of as a function of the variable E, or energy. The reason why such sums permeate exact solutions of problems will now become clear, and will lead us to define Green’s functions. Consider the Schrödinger equation

iħ ∂Ψ/∂t = ĤΨ  →  [iħ ∂/∂t − Ĥ]Ψ = 0.


Let us think of the equation as the product of the operator (or matrix) iħ ∂/∂t − Ĥ with Ψ. For this product to be zero, either iħ ∂/∂t − Ĥ, or Ψ, or both must be zero. The only interesting case here is when we actually have a quantum object with a nonzero wavefunction, Ψ ≠ 0. Thus, iħ ∂/∂t − Ĥ should be zero. Now we have learned that if the quantum object is in a state of definite energy, iħ ∂Ψₙ/∂t = EₙΨₙ, where Eₙ is a real eigenvalue representing the energy of the state. Let us generalize this and write iħ ∂/∂t = E, where E is a variable. We can then write the Schrödinger equation as [EI − Ĥ]Ψ = 0, where I is an identity operator, or the identity matrix when the equation is written out for any chosen basis. However, the equation in this form does not hold true for all E, but only for certain E = Eₙ – only when the variable E matches up with an allowed eigenvalue. Now let us think of EI − Ĥ as a function of E. When we vary E, this function has very sharp responses when E = Eₙ: the function is a ”detector” of eigenvalues – it detects an eigenvalue by vanishing. At those sharp energies, Ψ = Ψₙ ≠ 0 is an eigenfunction, so the function provides the eigenfunction as its ”residue”. Now with this qualitative picture in mind, let us solidify the concept of the Green’s function of the system. We like detectors to ”scream” when they detect, rather than to go silent. So, can we find a function Ĝ that instead of solving the equation [EI − Ĥ]Ψ = 0, solves the equation [EI − Ĥ]Ĝ = I instead? Formally, the function is clearly Ĝ = [EI − Ĥ]⁻¹. This function clearly blows up when E = Eₙ, and is indeed the screaming detector we are looking for. It is the Green’s function for the Hamiltonian Ĥ. Let us assume that we know all the eigenvalues of a particular Hamiltonian Ĥ₀ to be Eₙ, and the corresponding eigenfunctions are |n⟩. The Green’s function can then be written out in matrix form:

Ĝ₀(E) = [EI − Ĥ₀]⁻¹ = Σₙ |n⟩⟨n| / (E − Eₙ).




It is clear that the Green’s function is actually a matrix, and sums of the kind that appeared earlier in the solution of the Kronig–Penney and the defect problems are the sum of the diagonal terms in a diagonal basis. Now it turns out that the sum of the diagonal terms is invariant to what basis is chosen for the matrix – which is why it goes by a name – the Trace. Thus, we have a very important relation

Trace[Ĝ(E)] = Σ_k 1/(E − E₀(k)),



where E₀(k) are the allowed eigenvalues of the system. The solution of the Kronig–Penney model is thus very compactly written in the formal way as Trace[Ĝ₀(E)] = a/S, where Ĝ₀(E) = (EI − Ĥ₀)⁻¹, and Ĥ₀|k⟩ = E₀(k)|k⟩ is the nearly free–electron eigensystem, with E₀(k) = ħ²(k + G)²/(2mₑ). The solution of a single-site defect state of strength S₀ is then written as Trace[Ĝ(E)] = Na/S₀, where now the Green’s function is for the exactly solved Kronig–Penney eigensystem, Ĝ(E) = (EI − Ĥ)⁻¹, with Ĥ|k⟩ = E_KP(k)|k⟩, where E_KP(k) are the eigenvalues of the Kronig–Penney model. We can write the Green’s function in a non-diagonal basis as well. For general basis states |l⟩ and |l′⟩ we can write

Ĝ₀(l, l′; E) = ⟨l|[EI − Ĥ]⁻¹|l′⟩ = Σₙ ⟨l|n⟩⟨n|l′⟩ / (E − Eₙ),


which includes both diagonal and off-diagonal terms of the matrix. Since this form does not require us to start with a diagonal basis, it is preferred, and is called the spectral representation of the Green’s function. For example, this can be applied to the nearly free–electron ”on a ring” of circumference L and lattice constant a. Since the eigenfunctions are ⟨x|n⟩ = (1/√L) e^{i kₙ x}, of energy Eₙ = ħ²kₙ²/(2mₑ), its Green’s function is

Ĝ₀(x, x′; E) = (1/L) Σₙ e^{i kₙ (x − x′)}/(E − Eₙ) = ∫_{−π/a}^{+π/a} (dk/2π) e^{i k (x − x′)}/(E − ħ²k²/2mₑ),

Fig. 13.7 Conyers Herring developed the orthogonalized plane-wave (OPW) method in the 1940s, quite a bit ahead of the times. He created the theoretical division of Bell Laboratories, and led and influenced several solid–state theorists and experimentalists. He contributed to the understanding of electron transport in semiconductors: the ionized impurity scattering theory is described in Chapter 23. He identified accidental band crossings, which have today developed into topological Weyl semimetals. Herring also introduced several pieces of semiconductor jargon, such as ”valleys” and ”intervalley processes”.
which depends on both the spatial dimension and energy. This form of the Green’s function is related to the local density of states.
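The basis-invariance of the trace noted above is easy to verify numerically. In the Python sketch below, a small random Hermitian matrix stands in for Ĥ (all numbers are illustrative); the trace of (EI − Ĥ)⁻¹ computed in the original, non-diagonal basis matches the spectral sum Σₙ 1/(E − Eₙ):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = 0.5 * (A + A.conj().T)            # a random Hermitian "Hamiltonian"
E_n = np.linalg.eigvalsh(H)           # its real eigenvalues

E = 100.0 + 0.0j                      # probe energy away from all eigenvalues
G = np.linalg.inv(E * np.eye(n) - H)  # Green's function G(E) = (EI - H)^-1

trace_G = np.trace(G)
spectral_sum = np.sum(1.0 / (E - E_n))
print(trace_G, spectral_sum)          # the two agree: the trace is basis-invariant
```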

13.6 Pseudopotentials: what they are and why they work

The 3D version of the Dirac delta potential is V(r) = Sδ(r), where S has units of eV·cm³. For any crystal lattice (e.g. FCC, HCP, etc.), a 3D Dirac comb potential is formed by attaching a Dirac delta at each lattice point: V(r) = S Σₙ δ(r − Rₙ), where Rₙ = n₁a₁ + n₂a₂ + n₃a₃ is a real space lattice vector. Assuming S to be negative, and a one atom


basis, we can imagine it to be the attractive potential well introduced by the charges on the nucleus and all the core electrons of that atom. For example, in a silicon crystal, there are four valence electrons, so the (nucleus + core electrons) have a net charge of Q = +4e. If the entire charge is assumed to be at the center of the nucleus (this is clearly a massive approximation!), the unscreened Coulomb potential will be V(r) = −4e²/(4πε₀r), which is a deep potential well. The Dirac delta potential with a negative S is a very crude model of this potential. The Fourier transform of the 3D Dirac comb potential is simply another Dirac comb potential in wavevector space: V(r) = S Σₙ δ(r − Rₙ) = (S/Ω) Σₙ e^{iGₙ·r} ⟹ V(q) = Σₙ V_{Gₙ} δ(q − Gₙ), where Ω is the volume of the unit cell in real space, the Gₙ are the reciprocal lattice vectors, and the V_{Gₙ} are the Fourier coefficients. The defining property of the Dirac comb potential is that all Fourier coefficients V_{Gₙ} are identical. From a perturbation theory perspective in the matrix eigenvalue representation, this means every electron k-state is coupled to every other k ± G state with the same strength, and all off-diagonal matrix elements take the same value. The exact bandstructure for the 3D Dirac comb potential therefore can be calculated by assuming the nearly free-electron states for the diagonal elements, and the Fourier coefficient of the Dirac potential as the off-diagonal elements. Exercise 13.2 discusses the modification to the nearly free-electron bandstructure for the FCC lattice caused by a Dirac comb potential. In real crystals, the potential V(r) experienced by electrons is not the faceless, shapeless form that the Dirac delta function models. The real periodic crystal potential varies within the unit cell, and has the geometry and the symmetries of the crystal baked in. While the 3D Dirac comb potential can capture the symmetries of the lattice, it cannot capture the details of the atomic potentials.
This detail is responsible for the rich variation of properties exhibited by crystals of different chemical compositions, such as, say, Si, Ge, and GaAs. For example, consider the potential experienced by any of the 4 valence electrons in the outermost shell of Ge or Si due to all the core electrons and the nucleus. It may be approximated by the Coulomb potential V_actual(r) = −4e²/(4πε₀r) if all the net +4e charge were precisely a point charge located at the nucleus. Fig. 13.8 shows this schematically. But because the core electrons are not located at the nucleus, and extend out to an effective radius r_c, the valence electrons experience an effective pseudopotential: they are mostly unaware of the true V(r) inside the core radius, but sensitive to V(r) outside. Thus, to calculate the wavefunctions and energies of only the valence electrons, the exact potential and wavefunction are traded off for the smooth pseudopotential and the resulting smooth pseudo-wavefunction, as shown in Fig. 13.8. If the core electrons were all located at the nucleus, the Fourier transform of the resulting Coulomb potential would take the well-known form V(q) = −4e²/(ε₀q²). But because of screening, the

Fig. 13.8 Concept of a pseudopotential and a pseudo-wavefunction near the nucleus of an atom. The central idea is that the deep core-level electrons experience the full Coulomb potential, whereas the valence electrons only experience an effectively screened pseudopotential, which can be used to construct a smooth wavefunction free of wiggles near the nucleus.

Fig. 13.9 James Phillips, who investigated chemical bonding and bands in solids, was one of the earliest to develop the pseudopotential bandstructure to understand semiconductors. He wrote an influential book called ”Bonds and Bands in Semiconductors”.


4 One must admit that being able to conclude the energy eigenvalues and eigenfunctions of electrons over the entire Brillouin zone k-space resulting from the complex crystal potential with as few as three numbers is indeed quite remarkable.

1/q² form is smoothed out as shown in Fig. 13.8. The minimum energy is roughly of the order of the Fermi energy for this potential, which would imply that the bandwidth of electron energies in the valence band should be of the order of E_F = ħ²k_F²/(2mₑ) with k_F = (3π²n_v)^{1/3}, if the valence electrons of density n_v were nearly free. In other words, a large part of the electron energies in the valence band will be close to the nearly free–electron bands. Some of the earliest experimental data measured for metals (e.g. Al, Cu), and for silicon and germanium, by X-ray Photoelectron Spectroscopy (XPS) showed that the energy bandwidth of the valence band was indeed about E_F ≈ ħ²(3π²n_v)^{2/3}/(2mₑ), and provided the first motivation for the pseudopotential bandstructure method. A crucial aspect is that since the atomic potential is periodic in the real space lattice, its Fourier components at the points q_nm = |Gₙ − Gₘ| are the only ones that valence electron waves experience. As a consequence, the entire bandstructure is found by just knowing the values of the pseudopotential at those specific points V_pseudo(q_nm), as shown in Fig. 13.8. Since for large q the pseudopotential falls off fast as 1/q², the number of pseudopotential values needed to calculate the entire bandstructure can be as small as three for Si and Ge⁴! Though modern first-principles methods give a way to calculate these pseudopotentials, they are best concluded from experimental measurements of optical spectra, as will be described in Chapter 27. When the experimental data is used to find the pseudopotentials, this method is referred to as the Empirical Pseudopotential Method (EPM) for bandstructure calculation. Texts referred to at the end of this chapter provide more rigorous justification of the use of the pseudopotential technique. We proceed directly to demonstrate its use in calculating the bandstructures for a few semiconductors.
The starting point for pseudopotential bandstructure calculation is familiar and simple: it is exactly the same nearly free-electron bandstructure of the empty lattice that was described in Chapter 10. The nearly free–electron eigenvalues are, as usual,

E(k_x, k_y, k_z) = (ħ²/2mₑ)[(k_x − G_x)² + (k_y − G_y)² + (k_z − G_z)²].


By choosing a set of G vectors with various (l, m, n), we create the pseudopotential matrix, with rows labeled by ⟨k − Gᵢ| and columns by |k − Gⱼ⟩:

    [ ħ²(k−G₁)²/2mₑ    V_ps(G₁−G₂)      V_ps(G₁−G₃)      ... ]
    [ V_ps(G₂−G₁)      ħ²(k−G₂)²/2mₑ    V_ps(G₂−G₃)      ... ]
    [ V_ps(G₃−G₁)      V_ps(G₃−G₂)      ħ²(k−G₃)²/2mₑ    ... ]
    [ ...              ...              ...              ... ]
                                                         (13.19)

where the diagonal elements are the nearly free–electron energies and



off-diagonal elements are pseudopotentials of the form

V_ps(Gᵢ − Gⱼ) = S^S(Gᵢ − Gⱼ) · V^S(|Gᵢ − Gⱼ|²) + iS^A(Gᵢ − Gⱼ) · V^A(|Gᵢ − Gⱼ|²),


where S^S(Gᵢ − Gⱼ) and S^A(Gᵢ − Gⱼ) are the symmetric and antisymmetric parts of the structure factor that captures the lattice symmetries, and V^S(|Gᵢ − Gⱼ|²) and V^A(|Gᵢ − Gⱼ|²) are the symmetric and antisymmetric parts of the pseudopotential of the particular crystal, which capture its chemical nature. The pseudopotential electron states are linear combinations of the plane waves |ψ⟩ = Σ_G c_G |k + G⟩, and the problem is solved completely by evaluating the eigenvalues and eigenfunctions of the equation

    [ ħ²(k−G₁)²/2mₑ    V_ps(G₁−G₂)      V_ps(G₁−G₃)      ... ] [ c_G₁ ]          [ c_G₁ ]
    [ V_ps(G₂−G₁)      ħ²(k−G₂)²/2mₑ    V_ps(G₂−G₃)      ... ] [ c_G₂ ]          [ c_G₂ ]
    [ V_ps(G₃−G₁)      V_ps(G₃−G₂)      ħ²(k−G₃)²/2mₑ    ... ] [ c_G₃ ]  = E(k)  [ c_G₃ ]
    [ ...              ...              ...              ... ] [ ...  ]          [ ...  ]
                                                                                 (13.21)

which is solved for the eigenvalues E(k) that form the bandstructure of the semiconductor. Each energy eigenvalue corresponds to an eigenvector (c_G₁, c_G₂, ...) which gives the corresponding electron eigenfunction ψ(r) = Σₙ c_Gₙ e^{i(k+Gₙ)·r}. This eigenfunction may be plotted in real space to visualize the electron bonds, and also used to evaluate matrix elements required for transport or optical properties. The size of the matrix is determined by the number of plane waves based on the choices of G. For example, including reciprocal lattice vectors indexed as (n₁, n₂, n₃) over the range (−3, −3, −3)...(0, 0, 0)...(+3, +3, +3) implies 7³ = 343 distinct values of Gₙ, and therefore a 343 × 343 matrix⁵.
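A generic sketch of the diagonalization in Equation 13.21 is shown below in Python. The simple-cubic G set and the single attractive form factor `V` are hypothetical placeholders chosen only to make the example self-contained (units are such that ħ²/2mₑ = 1 and wavevectors are in 2π/a):

```python
import numpy as np
from itertools import product

def epm_bands(k, G_list, V_of_G2, nbands=8):
    """Build and diagonalize the plane-wave Hamiltonian of Eq. 13.21.

    k       : 3-vector in the Brillouin zone (units of 2*pi/a)
    G_list  : list of reciprocal lattice vectors (same units)
    V_of_G2 : function mapping |Gi - Gj|^2 to a pseudopotential value
    """
    G = np.asarray(G_list, dtype=float)
    N = len(G)
    H = np.zeros((N, N), dtype=complex)
    for i in range(N):
        H[i, i] = np.dot(k - G[i], k - G[i])       # kinetic diagonal
        for j in range(N):
            if i != j:
                dG2 = np.dot(G[i] - G[j], G[i] - G[j])
                H[i, j] = V_of_G2(dG2)             # off-diagonal V_ps
    return np.linalg.eigvalsh(H)[:nbands]          # lowest bands at this k

# toy example: simple-cubic G vectors and one attractive form factor
Gs = [np.array(g) for g in product((-1, 0, 1), repeat=3)]
V = lambda dG2: -0.1 if np.isclose(dG2, 1.0) else 0.0
bands = epm_bands(np.array([0.0, 0.0, 0.0]), Gs, V)
print(bands)
```

Swapping in the FCC G set of Equation 13.22 and the structure-factor-weighted form factors of Equation 13.23 turns this skeleton into the empirical pseudopotential method proper.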

13.7 Bandstructure of Si, Ge, and GaAs

A large number of semiconductors belong to the FCC lattice structure of lattice constant a. The real space lattice vectors are a₁ = (a/2)(x̂ + ŷ), a₂ = (a/2)(ŷ + ẑ), and a₃ = (a/2)(ẑ + x̂), and the volume of the unit cell is Ω_r = a³/4. The corresponding reciprocal lattice vectors needed for the pseudopotential bandstructure calculations are then b₁ = (2π/a)(1, 1, −1), b₂ = (2π/a)(1, −1, 1), and b₃ = (2π/a)(−1, 1, 1), which form the general FCC reciprocal lattice vector

G = n₁b₁ + n₂b₂ + n₃b₃ = (2π/a)(n₁ + n₂ − n₃, n₁ − n₂ + n₃, −n₁ + n₂ + n₃).   (13.22)

The nearly free–electron bandstructure for the FCC lattice along the L → Γ → X → U, K → Γ high symmetry path that was discussed

5 The eigenvalues and eigenfunctions of a matrix of this size are obtained within seconds or minutes on a modern personal computer. Because of this ease of computation, the reader is strongly encouraged to set up the following pseudopotential bandstructure examples on his/her own computer.




















Fig. 13.10 Bandstructures of (a) nearly free–electron, and that of (b) Si, (c) Ge, and (d) GaAs calculated by the empirical pseudopotential method. Note the similarities in several areas of the semiconductor bandstructure to the nearly free–electron bandstructure far from the high symmetry points. In those sections of the bandstructure, it is as if the electron does not see the atoms! Note that the energy axis ranges are the same for all the panels, though the absolute values are not.

Table 13.1 Pseudopotential parameters for cubic semiconductors in Ry = 13.6 eV units. Basic Semiconductor Physics by C. Hamaguchi provides tables of such pseudopotentials, including those of Table 13.2.

          a (Å)   V3S      V8S      V11S    V3A     V4A     V11A
  Si      5.43    −0.21    +0.04    +0.08   0       0       0
  Ge      5.66    −0.23    +0.01    +0.06   0       0       0
  GaAs    5.64    −0.23    +0.01    +0.06   +0.07   +0.05   +0.01
  AlAs    5.66    −0.221   +0.025   +0.07   +0.08   +0.05   −0.004
  InAs    6.04    −0.22    0.00     +0.05   +0.08   +0.05   +0.03
  GaP     5.44    −0.22    +0.03    +0.07   +0.12   +0.07   +0.02
  InP     5.86    −0.23    +0.01    +0.06   +0.07   +0.05   +0.01
  InSb    6.48    −0.20    +0.00    +0.04   +0.06   +0.05   +0.01
  GaSb    6.12    −0.22    +0.01    +0.05   +0.06   +0.05   +0.01
  AlSb    6.13    −0.21    +0.02    +0.06   +0.06   +0.04   +0.02

in Chapter 10 is reproduced in Fig. 13.10 (a) for comparison with the pseudopotential bandstructures of various semiconductors. With the two basis atoms at r_A = (a/8)(1, 1, 1) and r_B = −(a/8)(1, 1, 1), the structure factors S_A(G) = e^{iG·r_A}/2 and S_B(G) = e^{iG·r_B}/2 lead to the pseudopotential matrix elements

V_ps(Gᵢ − Gⱼ) = V^S(|Gᵢ − Gⱼ|²) cos[(π/4)((n₁ − m₁) + (n₂ − m₂) + (n₃ − m₃))]
             + iV^A(|Gᵢ − Gⱼ|²) sin[(π/4)((n₁ − m₁) + (n₂ − m₂) + (n₃ − m₃))],   (13.23)

where (n₁, n₂, n₃) and (m₁, m₂, m₃) define the reciprocal lattice vectors Gᵢ and Gⱼ. The value of |Gᵢ − Gⱼ|² ranges from 0 to 27 in units of (2π/a)² as the indices (n₁, n₂, n₃) and (m₁, m₂, m₃) are varied from (0, 0, 0) to (±3, ±3, ±3). A majority of pseudopotential terms vanish due to the cos(...) and sin(...) terms in Equation 13.23; the smallest three that survive have |G|² values of 3, 8, 11. The symmetric and antisymmetric pseudopotential coefficients for these surviving terms are listed in Table 13.1 in Rydberg (= 13.6 eV) units. The strength of the pseudopotential changes with q² = |G|² roughly as was anticipated by the upward pointing arrows in Fig. 13.8. With Table 13.1 and Equation 13.21, the bandstructures of a large number of semiconductors of the FCC lattice structure are easily evaluated by the empirical pseudopotential method. The resulting bandstructures of silicon, germanium, and GaAs are shown in Fig. 13.10. Thick lines are valence bands and thin lines are conduction bands. The bandstructure of Si and Ge only needs three pseudopotentials, V3S, V8S, and V11S; the antisymmetric terms vanish because both atoms in the basis are identical. The indirect bandgap nature of silicon and germanium is highlighted by the conduction band minima along the Γ − X direction for silicon and at the L-point for germanium. The valence bands of silicon and germanium appear nearly identical in shape over this energy scale. Note that for silicon, the energy separation between the lowest conduction band and the highest valence band is nearly the same along the entire k-path plotted: this leads to a rather large absorption peak for a photon of this energy. This optical critical point behavior will be discussed in Chapter 27. There is an X-point valence band crossing at ∼ 8 eV below the valence band maximum (labeled 0 eV) for silicon and germanium. This degeneracy is lifted in the bandstructure of GaAs because the antisymmetric pseudopotentials are non-zero on account of the non-identical Ga and As atoms in the basis. It is highly instructive to compare the pseudopotential bandstructures of silicon, germanium, and GaAs in Fig. 13.10 with the nearly free–electron bandstructure of Fig. 13.10 (a). The lowest valence bands are nearly unchanged, except at the L point, as are some more valence bands. For semiconductors, most changes occur at the Γ-point over the ∼ 15 eV range: the pseudopotential terms split the large number of bands that cross there to open energy gaps. Some sections of the conduction bands at higher energies are also seen to be nearly unchanged from the nearly free–electron case.
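The index bookkeeping of Equations 13.22 and 13.23 can be verified numerically. The Python sketch below generates Gᵢ − Gⱼ from index differences, tabulates the distinct |G|² (in units of (2π/a)²), and checks which of them carry a nonvanishing symmetric structure factor:

```python
import numpy as np
from itertools import product

def fcc_G(n1, n2, n3):
    """Reciprocal lattice vector of Eq. 13.22, in units of 2*pi/a."""
    return np.array([n1 + n2 - n3, n1 - n2 + n3, -n1 + n2 + n3])

all_G2, surviving = set(), set()
for d in product(range(-3, 4), repeat=3):      # index differences (n_i - m_i)
    G = fcc_G(*d)
    G2 = int(np.dot(G, G))
    all_G2.add(G2)
    # symmetric structure-factor weight of Eq. 13.23
    if G2 > 0 and abs(np.cos((np.pi / 4) * sum(d))) > 1e-12:
        surviving.add(G2)

print(sorted(all_G2)[:6])     # -> [0, 3, 4, 8, 11, 12]
print(sorted(surviving)[:3])  # -> [3, 8, 11]
```

The |G|² = 4 and 12 shells are killed by the cosine, and the surviving set begins with 3, 8, 11 – which is why only V3, V8, and V11 appear in Table 13.1.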

13.8 Bandstructure of AlN, GaN, and InN

We next consider the bandstructure of the nitride semiconductors GaN, AlN, and InN, which are used for visible and ultraviolet photonics, as well as for ultrafast and high-voltage electronics. These semiconductors are of the wurtzite crystal structure, with a hexagonal close-packed (HCP) lattice. The nearly free–electron bandstructure for the HCP lattice shown in Fig. 13.11 was discussed in Chapter 10 (Fig. 10.12). The real space lattice vectors are a₁ = a x̂, a₂ = a((1/2) x̂ + (√3/2) ŷ), and a₃ = c ẑ, and the volume of the unit cell is Ω_r = (√3/2) a²c. The corresponding reciprocal space lattice vectors are b₁ = (2π/a)(x̂ − (1/√3) ŷ), b₂ = (2π/a)(2/√3) ŷ, and b₃ = (2π/c) ẑ, with reciprocal unit cell volume (2π)³/Ω_r. The reciprocal lattice vectors are then given by G = m₁b₁ + m₂b₂ + m₃b₃. Defining c = a/√u, the dimensionless ratio u = (a/c)² is used to characterize the semiconductor. For an ideal HCP structure, c/a = √(8/3), or u = 3/8. The equilibrium lattice constants of GaN, AlN, and InN differ from this ideal value. The square of the reciprocal lattice vector of indices (l, m, n) then is

|G|² = (2π/a)² [l² + (1/3)(2m − l)² + un²].   (13.24)

The pseudopotential matrix elements are

V_ps(Gᵢ − Gⱼ) = S^S(Gᵢ − Gⱼ) · V^S(|Gᵢ − Gⱼ|²) + iS^A(Gᵢ − Gⱼ) · V^A(|Gᵢ − Gⱼ|²),
Fig. 13.11 Nearly free–electron bandstructure for an a = 3.18 Å, u = 0.377 HCP lattice.

Table 13.2 Pseudopotential parameters for a few wurtzite semiconductors.

         V^S(q)      V^A(q)
  GaN
  a1     0.04258     0.5114
  a2     13.079      −20.122
  a3     0.226       0.0835
  a4     20.393      −41.557
  AlN
  a1     0.0363      0.0323
  a2     11.960      −145.212
  a3     0.234       0.0947
  a4     22.233      −19.160
  InN
  a1     0.0459      0.0221
  a2     12.542      −35.605
  a3     0.299       0.0574
  a4     17.691      −18.261
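The |G|² expression of Equation 13.24 can be cross-checked against the explicit reciprocal lattice vectors. A minimal Python sketch (with a = 1 and the ideal u = 3/8 as assumed inputs):

```python
import numpy as np
from itertools import product

a = 1.0
u = 3.0 / 8.0                  # ideal HCP: u = (a/c)^2
c = a / np.sqrt(u)

# reciprocal lattice basis vectors of the HCP lattice
b1 = (2 * np.pi / a) * np.array([1.0, -1.0 / np.sqrt(3.0), 0.0])
b2 = (2 * np.pi / a) * np.array([0.0, 2.0 / np.sqrt(3.0), 0.0])
b3 = (2 * np.pi / c) * np.array([0.0, 0.0, 1.0])

for l, m, n in product(range(-2, 3), repeat=3):
    G = l * b1 + m * b2 + n * b3
    closed_form = (2 * np.pi / a) ** 2 * (l**2 + (2*m - l)**2 / 3.0 + u * n**2)
    assert np.isclose(np.dot(G, G), closed_form)
print("Eq. 13.24 verified for all |l|, |m|, |n| <= 2")
```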










Fig. 13.12 Bandstructures of wurtzite AlN, GaN, and InN calculated by the empirical pseudopotential method.

where the HCP structure factors are

S^S = cos[2π(l/6 + m/6 + n/4)] cos[πnu]
S^A = cos[2π(l/6 + m/6 + n/4)] sin[πnu],


and the pseudopotential is expressed as a continuous function of the wavevector q:

V(q) = a₁(q² − a₂) / (1 + e^{a₃(q² − a₄)}),   (13.27)

Fig. 13.14 Pseudopotential matrix elements for GaN, expressed as a continuous function of |G₁ − G₂|².




where the aᵢ values are given in Table 13.2 individually for the symmetric and the antisymmetric terms of the pseudopotential, and plotted in Fig. 13.14. Note that pseudopotentials are retained for |G|² < 16. The resulting bandstructures for AlN, GaN, and InN are shown in Fig. 13.12. All three nitrides are direct-bandgap semiconductors. In addition, if the conduction bands are considered alone, the intervalley separation between the lowest Γ-point energy and the next lowest conduction band valleys is nearly 1.5 eV or more; the comparative intervalley energy separation is quite a bit smaller in GaAs and Ge, and somewhat smaller in silicon. Compared to the cubic crystals of Fig. 13.10, the valence band widths are observed to be narrower for the nitride semiconductors. The highest (heavy-hole) valence band has far less curvature than the conduction band, highlighting the large heavy hole effective mass in the nitrides. It is again instructive to compare all three bandstructures of Fig. 13.12 with the nearly free–electron bandstructure of the HCP lattice in Fig. 13.11. The bandgaps and critical points in optical spectra of the nitride family of semiconductors are captured well by the empirical pseudopotential bandstructures. Fig. 13.13 shows the crystal structure

13.9 Pseudopotentials to DFT and beyond 285





Fig. 13.13 Experimentally measured valence bandstructure of GaN compared to the DFT calculation. The DFT calculation is nearly identical to the pseudopotential bandstructure shown in Fig. 13.12. The experimental measurement uses angle-resolved photoelectron spectroscopy (ARPES). Courtesy of V. Strocov, T. Yu, J. Wright, C. Chang, B. Pamuk, and G. Khalsa.

of GaN indicating the individual atoms arranged in the HCP lattice as imaged by transmission electron microscopy. The Brillouin zone is shown with the high symmetry points. A color figure of the experimentally measured valence band structure of GaN and the density functional theory (DFT) calculation (dashed lines) are shown, along with the density of states of the valence band. Comparison of the empirical pseudopotential bandstructure and the DFT bandstructure indicates they are nearly identical for the valence band. The difference between the methods is that the pseudopotentials given by Equation 13.27 are empirically determined from comparison to experiment, whereas those of DFT are calculated from first principles. We give a brief discussion of the DFT method to wrap up the discussion on electron bandstructures of semiconductors.
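As a quick sanity check of the empirical form factors, the Python sketch below evaluates the continuous pseudopotential of Equation 13.27, assuming the four-parameter form V(q) = a₁(q² − a₂)/(1 + e^{a₃(q² − a₄)}), with the symmetric-term GaN aᵢ values from Table 13.2 (output in Ry, q² in units of (2π/a)²):

```python
import numpy as np

def V_pseudo(q2, a1, a2, a3, a4):
    """Continuous pseudopotential form factor V(q) of Eq. 13.27 (in Ry)."""
    return a1 * (q2 - a2) / (1.0 + np.exp(a3 * (q2 - a4)))

# symmetric-term parameters for GaN from Table 13.2
gan_S = dict(a1=0.04258, a2=13.079, a3=0.226, a4=20.393)

for q2 in (3.0, 8.0, 16.0, 100.0):
    print(f"V(q^2 = {q2:5.1f}) = {V_pseudo(q2, **gan_S):+.5f} Ry")
```

The form factor is attractive (negative) at small q² and decays rapidly beyond q² ≈ a₄, consistent with the |G|² < 16 cutoff used in the nitride calculations.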

13.9 Pseudopotentials to DFT and beyond

Fermi and Thomas had introduced pseudopotentials as a means to approximate the distribution of valence electrons in atoms. The valence electrons are screened from the Coulomb potential of the nucleus and the core electrons (this is called Thomas–Fermi screening), and the resulting Fermi energy becomes a function of the volume density of the valence electrons, i.e. E_F = f(n_v). The pseudopotential methods, as well as DFT, are traced back to this idea. Kohn and Hohenberg proved an important theorem: the ground-state energy of a quantum mechanical system governed by the Schrödinger equation is a function of the density function n(r) of electrons, which varies in space as the square of the wavefunction, n(r) ∼ |ψ(r)|². In other words, to find the ground state energy we can get away without knowing the wavefunction, as long as we know the density. Because the energy is a function of the density function, it is a functional⁶. In DFT, instead of determining the potential experienced by electrons by empirical methods, it is calculated ab initio by solving the Schrödinger equation using the Kohn–Hohenberg theorem. This method is computationally more intensive, and therefore the number of atoms that can be simulated is restricted. Difficulties in obtaining accurate bandgaps of semiconductors have been addressed recently by the introduction of many-body corrections and hybrid functionals. The real power of the DFT methods is the host of physical properties beyond the bandstructure that can be computed with them, such as the total energy of the crystal, which is minimized at the equilibrium lattice constants and crystal structure, and the energetic preference of defects and dopants in the semiconductor crystals. Because bandstructures needed to simulate the operation of electronic and photonic devices must be computationally efficient, DFT is often used to find accurate parameters for either pseudopotential, tight-binding, or k · p bandstructures. Once the parameters are determined, these techniques are far more nimble than DFT for device level simulations. This section marks the end of the extensive discussion of the computation of semiconductor bandstructures in the past few chapters. In the following chapters extensive use of the properties discussed will be made, both for bulk semiconductors and their quantized heterostructures. The recipe to capture the bands of doped semiconductors and heterostructures is the subject of the next Chapter (14).

6 A functional is a function of a function.
Before you leave this chapter, you will benefit by circling back to the discussion in Chapter 9, Section 9.2 on how the Bloch theorem led to all four recipes of bandstructure: (1) the nearly free–electron bandstructure (Chapter 10), (2) the tight-binding bandstructure (Chapter 11), (3) the k · p bandstructure (Chapter 12), and (4) the exact and empirical pseudopotential bandstructure (this chapter). The quest for ever more perfect experimental measurement, computation, and understanding of bandstructures continues to be at the forefront of semiconductor physics in particular, and condensed matter physics in general.

13.10 Chapter summary section

In this chapter, we learned:

• The exact bandstructure for Dirac comb potentials, and its use to model defects,
• The empirical pseudopotential method of bandstructure calculations, and
• The hierarchy of various bandstructure calculation methods.


Further reading

Electronic Structure and Optical Properties of Semiconductors by M. L. Cohen and J. R. Chelikowsky, written by the original architects of the pseudopotential method for semiconductors, remains the gold standard for the subject. Fundamentals of Semiconductors by P. Yu and

M. Cardona has excellent discussion of bandstructure calculation methods in general, and the empirical pseudopotential method in particular. Basic Semiconductor Physics by C. Hamaguchi is a good resource for several pseudopotential bandstructure calculation details.

Exercises

(13.1) Defect states in the Kronig–Penney model
We have discussed the Kronig–Penney crystal model, and how defect states cause energy eigenvalues to split off from band-edges. For simplicity, we are going to solve this problem for k = 0 (the Γ point), and neglect all other k-points. Let us say that a perfect Kronig–Penney crystal of lattice constant a has N eigenvalues Eᵢ with i = 1...N in a band at Γ. All other energy bands are very far in energy, and may be neglected. If at one of the lattice sites a defect changes the delta function strength by U₀, then the new exact eigenvalues are given by solving

Na/U₀ = Trace[G(E)] = Σᵢ₌₁ᴺ 1/(E − Eᵢ),   (13.28)

for allowed energies E, where U₀ is in eV·nm units, and G(E) is the Green's function, the Trace of which is just the inverse sum on the right.

(a) Argue why the equation above is correct if no defect is present.

(b) Show graphically that if the eigenvalues themselves are widely separated, and U0 ≫ 0, the eigenvalue that splits off the band due to the defect has energy

$$E_{s+} \approx E_N + \frac{U_0}{a}.$$

(c) Using your graph of part (b), show that if U0 …

$$V(x, y, z) = 0, \quad z < 0 \ \text{or} \ z > W; \qquad V(x, y, z) = -\Delta E_c, \quad 0 \le z \le W.$$

Using the effective mass equation with this potential, it is evident that the envelope function should decompose as

$$C_{n_z}(x, y, z) = \phi(x, y)\,\chi_{n_z}(z) = \left[\frac{1}{\sqrt{A}}\, e^{i(k_x x + k_y y)}\right]\cdot[\chi_{n_z}(z)].$$

If the quantum well is assumed to be infinitely deep, by a simple wave-fitting procedure the z-component of the electron quasi-momentum is quantized to the values⁴

$$k_{n_z} = n_z\,\frac{\pi}{W}, \qquad (14.32)$$

where nz = 1, 2, 3, …. From the simple particle-in-a-box model in quantum mechanics, the normalized z-component of the envelope function is

$$\chi_{n_z}(z) = \sqrt{\frac{2}{W}}\,\sin\frac{\pi n_z z}{W}. \qquad (14.33)$$

⁴ Only waves that satisfy nz(λ/2) = W fit into the well of width W, leading to k_nz = 2π/λ = (π/W)nz.

300 Doping and Heterostructures: The Effective Mass Method

The bandstructure is the set of energy eigenvalues obtained from the effective mass equation, given by

$$E(\mathbf{k}) = E_{c0} + \underbrace{\frac{\hbar^2}{2}\left(\frac{k_x^2}{m^\star_{xx}} + \frac{k_y^2}{m^\star_{yy}}\right)}_{E_{2D}(k_x, k_y)} + \underbrace{\frac{\hbar^2}{2 m^\star_{zz}}\left(\frac{\pi n_z}{W}\right)^2}_{E_{1D}(n_z)},$$

which evidently decomposes into a free-electron component in the x–y plane and a quantized component in the z-direction. The bandstructure consists of multiple bands E_2D(kx, ky), each indexed by the quantum number nz; this is shown in Fig. 14.6. The DOS of electrons confined in an ideal 2D plane is a constant, given by g_2D(E) = m⋆/πħ² (from Section 5.4). In the quantum well, each subband corresponding to an nz is an ideal 2D system, and each subband contributes g_2D(E) to the total DOS. This is shown schematically in Fig. 14.6. Thus, the DOS of the quantum well is

$$g_{QW}(E) = \frac{m^\star}{\pi\hbar^2} \sum_{n_z} \theta(E - E_{n_z}),$$



where θ(...) is the unit step function. The carrier density in the first subband of an ideal 2D electron system is thus given by

$$n_{2D} = \int_0^\infty dE\, f_{FD}(E)\, g_{2D}(E) = \underbrace{\frac{m^\star k_B T}{\pi\hbar^2}}_{N_C^{2D}} \ln\left(1 + e^{\frac{E_F - E_1}{k_B T}}\right),$$
where E1 is the ground state energy, EF is the Fermi level, and N_C^{2D} is the effective band-edge DOS, the 2-dimensional counterpart of N_C^{3D} defined in Equation 14.24. Several examples were given earlier in Chapter 5 and elsewhere. For the quantum well, which houses many subbands, the DOS becomes a sum over subbands (Fig. 14.6), and the total carrier density is thus a sum of 2D carriers housed in each subband:

$$n_{2D} = \sum_j n_j = N_c^{2D} \sum_j \ln\left(1 + e^{\frac{E_F - E_j}{k_B T}}\right).$$
Note that for a 2D system, no approximation of the Fermi–Dirac integral is necessary, for the carrier density is obtained analytically. It is important to note that the confining potential in the z-direction can be engineered almost at will by modern epitaxial techniques, by controlling the spatial changes in material composition. For example, a popular quantum well structure has a parabolic potential (V(z) ∼ z²), which leads to E_nz values spaced at equal energy intervals: this is the characteristic of a harmonic oscillator potential. Another extremely important quantum well structure is the triangular well potential (V(z) ∼ z), which appears in MOSFETs, HEMTs, and quantum wells under electric fields. The triangular well leads to E_nz values given by the zeros of the Airy function. Regardless of these details specific to the shape of the potential, the bandstructure and the DOS remain similar to the square well case; the only modification is in the E_nz values, and the corresponding subband separations.

Fig. 14.7 Bandstructure, and DOS of realistic quantum wires.
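The subband ladder, staircase DOS, and subband-by-subband sheet density above are straightforward to evaluate numerically. The following is a minimal Python sketch; the function names and the 10 nm GaAs-like well (m⋆ = 0.067me) are illustrative choices, not values from this section:

```python
import math

hbar = 1.0545718e-34    # J s
me   = 9.1093837e-31    # kg
q    = 1.602176634e-19  # J per eV
kB   = 1.380649e-23     # J/K

def subband_energies_eV(W, m_eff, n_max=10):
    """E_1D(n_z) = (hbar^2 / 2 m*_zz) (pi n_z / W)^2 for the infinite well, in eV."""
    return [(hbar * math.pi * n / W)**2 / (2 * m_eff) / q for n in range(1, n_max + 1)]

def g_qw(E_eV, W, m_eff):
    """Staircase DOS g_QW(E) = (m*/pi hbar^2) * (number of subbands below E), per J per m^2."""
    below = sum(1 for En in subband_energies_eV(W, m_eff) if E_eV >= En)
    return m_eff / (math.pi * hbar**2) * below

def n_2d(EF_eV, W, m_eff, T=300.0):
    """Sheet density per m^2: n_2D = Nc2D * sum_j ln(1 + exp((EF - Ej)/kB T))."""
    Nc2D = m_eff * kB * T / (math.pi * hbar**2)
    return Nc2D * sum(math.log1p(math.exp((EF_eV - Ej) * q / (kB * T)))
                      for Ej in subband_energies_eV(W, m_eff))

mstar = 0.067 * me                       # GaAs-like conduction band mass (assumed)
E = subband_energies_eV(10e-9, mstar)    # E1 is about 56 meV for a 10 nm well
```

Note that the analytic logarithm makes the density a closed-form sum: no numerical Fermi–Dirac integral is needed, exactly as the text observes.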


1D quantum wires

Artificial quantum wires are formed either lithographically (top-down approach), or by direct growth in the form of semiconductor nanowires or nanotubes (bottom-up approach). In a quantum well, out of the three degrees of freedom for real-space motion, carriers were confined in one direction, and were free to move in the other two directions. In a quantum wire, electrons are free to move in one direction only (hence the name "wire"), and the other two degrees of freedom are quantum-confined. Assume that the length of the wire (total length Lz) is along the z-direction (see Fig. 14.7), and the wire is quantum-confined in the x–y plane (Lx, Ly ≪ Lz). Then, the envelope function naturally decomposes into

$$C(x, y, z) = \chi_{n_x}(x)\cdot\chi_{n_y}(y)\cdot\left(\frac{1}{\sqrt{L_z}}\, e^{i k_z z}\right),$$


and the energy eigenvalues are given by

$$E(n_x, n_y, k_z) = E(n_x, n_y) + \frac{\hbar^2 k_z^2}{2 m^\star_{zz}}.$$


If the confinement in the x–y directions is by infinite potentials (a useful approximation applicable in many quantum wires), then similar to the quantum well situation, a wave-fitting procedure gives

$$k_{n_x} = \frac{\pi}{L_x}\, n_x \quad \text{and} \quad k_{n_y} = \frac{\pi}{L_y}\, n_y,$$

where nx, ny = 1, 2, 3, … independently. The eigenfunctions assume the form

$$C_{n_x, n_y}(x, y, z) = \left[\sqrt{\frac{2}{L_x}}\sin\left(\frac{\pi n_x}{L_x} x\right)\right]\cdot\left[\sqrt{\frac{2}{L_y}}\sin\left(\frac{\pi n_y}{L_y} y\right)\right]\cdot\left[\frac{1}{\sqrt{L_z}}\, e^{i k_z z}\right], \qquad (14.41)$$

and the corresponding bandstructure is given by

$$E(n_x, n_y, k_z) = \underbrace{\left[\frac{\hbar^2}{2 m^\star_{xx}}\left(\frac{\pi n_x}{L_x}\right)^2\right] + \left[\frac{\hbar^2}{2 m^\star_{yy}}\left(\frac{\pi n_y}{L_y}\right)^2\right]}_{E(n_x, n_y)} + \frac{\hbar^2 k_z^2}{2 m^\star_{zz}}.$$

Multiple subbands are formed, similar to the quantum well structure. A new subband forms at each eigenvalue E(nx, ny), and each subband has a dispersion E(kz) = ħ²kz²/2m⋆zz (Fig. 14.7). The DOS of electrons confined to an ideal 1D potential is given by (from Section 5.3)

$$g_{1D}(E) = \frac{1}{\pi}\sqrt{\frac{2 m^\star}{\hbar^2}}\,\frac{1}{\sqrt{E - E_1}}, \qquad (14.43)$$

where E1 is the lowest allowed energy (ground state). Due to multiple subbands, the DOS acquires peaks at every eigenvalue E(nx, ny). Since there are two quantum numbers involved, some eigenvalues can be degenerate, and the peaks can occur at irregular intervals as opposed to the quantum well case. The general DOS for a quantum wire can thus be written as

$$g_{QWire}(E) = \frac{1}{\pi}\sqrt{\frac{2 m^\star}{\hbar^2}} \sum_{n_x, n_y} \frac{1}{\sqrt{E - E(n_x, n_y)}}, \qquad (14.44)$$

which is shown schematically in Fig. 14.7.
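The irregular peak spacing from degenerate E(nx, ny) eigenvalues can be seen directly by tabulating the subband edges; a sketch, with an assumed 5 nm × 5 nm GaAs-like square wire:

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def wire_subband_edges(Lx, Ly, m_eff, n_max=4):
    """Subband edges E(nx, ny) of an infinite-barrier wire (in eV), grouped so that
    degenerate pairs -- e.g. (1,2) and (2,1) in a square wire -- are counted together."""
    edges = {}
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            E = (hbar * math.pi)**2 / (2 * m_eff) * ((nx / Lx)**2 + (ny / Ly)**2) / q
            edges[E] = edges.get(E, 0) + 1
    return dict(sorted(edges.items()))

def g_qwire(E_eV, Lx, Ly, m_eff):
    """Eq. (14.44): (1/pi) sqrt(2 m*/hbar^2) * sum over subband edges of 1/sqrt(E - Enm)."""
    s = sum(deg / math.sqrt((E_eV - Enm) * q)
            for Enm, deg in wire_subband_edges(Lx, Ly, m_eff).items() if E_eV > Enm)
    return math.sqrt(2 * m_eff) / (math.pi * hbar) * s

subs = wire_subband_edges(5e-9, 5e-9, 0.067 * me)   # square wire (assumed dimensions)
degs = list(subs.values())   # first edge is non-degenerate, the second is twofold
```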


0D quantum dots

The quantum dot is the ultimate nanostructure. All three degrees of freedom are quantum confined; therefore there is no plane-wave component of the electron wavefunctions. The envelope function for a "quantum box" of sides Lx, Ly, Lz (see Fig. 14.8) is thus written as

$$C(x, y, z) = \chi_{n_x}(x)\,\chi_{n_y}(y)\,\chi_{n_z}(z),$$




Fig. 14.8 Energy levels and DOS of quantum dots.

and if the confining potential is infinitely high, we have k_ni = (π/Li)ni for i = x, y, z. The envelope functions are thus given by

$$C(x, y, z) = \left[\sqrt{\frac{2}{L_x}}\sin\left(\frac{\pi n_x}{L_x} x\right)\right]\cdot\left[\sqrt{\frac{2}{L_y}}\sin\left(\frac{\pi n_y}{L_y} y\right)\right]\cdot\left[\sqrt{\frac{2}{L_z}}\sin\left(\frac{\pi n_z}{L_z} z\right)\right], \qquad (14.46)$$

and the energy eigenvalues are given by

$$E(n_x, n_y, n_z) = \frac{\hbar^2}{2 m^\star_{xx}}\left(\frac{\pi n_x}{L_x}\right)^2 + \frac{\hbar^2}{2 m^\star_{yy}}\left(\frac{\pi n_y}{L_y}\right)^2 + \frac{\hbar^2}{2 m^\star_{zz}}\left(\frac{\pi n_z}{L_z}\right)^2. \qquad (14.47)$$

Note that the energy eigenvalues are no longer quasi-continuous, and are indexed by three quantum numbers (nx, ny, nz). Thus, it does not make sense to talk about the "bandstructure" of quantum dots; the DOS is a sum of delta functions, written as

$$g_{QDot}(E) = \sum_{n_x, n_y, n_z} \delta(E - E_{n_x, n_y, n_z}).$$


This is shown schematically in Fig. 14.8. Since there is no direction of free motion, there is no transport within a quantum dot, and there are no quasi-continuous momentum components⁵. Fabricating quantum dots by lithographic techniques is pushing the limits of the top-down approach to the problem. Epitaxial techniques can instead coax quantum dots to self-assemble by cleverly exploiting the strain in lattice-mismatched semiconductors. On the other hand, bottom-up techniques of growing nanocrystals in solution by chemical synthetic routes are becoming increasingly popular.
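The "artificial atom" level structure of Eq. (14.47) and its degeneracies can be tabulated directly; a sketch with an assumed 5 nm GaAs-like cubic dot:

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def dot_levels_eV(L, m_eff, n_max=3):
    """Discrete levels of a cubic dot with infinite barriers, Eq. (14.47) with
    Lx = Ly = Lz = L: E = E0 (nx^2 + ny^2 + nz^2), E0 = (hbar pi / L)^2 / (2 m*)."""
    E0 = (hbar * math.pi / L)**2 / (2 * m_eff) / q
    return sorted(E0 * (nx*nx + ny*ny + nz*nz)
                  for nx in range(1, n_max + 1)
                  for ny in range(1, n_max + 1)
                  for nz in range(1, n_max + 1))

lev = dot_levels_eV(5e-9, 0.067 * me)   # assumed size and mass, for illustration
# ground state (1,1,1) sits at 3 E0; first excited level (2,1,1) at 6 E0 is
# threefold degenerate, just like an atomic shell structure
```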

⁵ The sharp energy levels of atoms in the periodic table are fixed, but those in quantum dots are up to us to design: they are artificial atoms!


14.8 Finite barrier heights










Fig. 14.9 shows how the eigenvalues change for a finite barrier height for quantum wells, in direct comparison to the eigenvalues for the ∞-barrier-height case. The solutions are obtained directly by enforcing the continuity of the wavefunction and its derivative at the two walls of the barrier. The procedure is discussed in Exercise 14.2. Panel (a) shows a schematic depiction of the finite quantum well, with the expected bound state eigenvalues and the corresponding eigenfunctions. Figs. 14.9 (b) and (c) show the eigenvalues, with the solid lines the finite-barrier-height well eigenvalues E1, E2, ..., and the dashed lines the ∞-well values E1∞, E2∞, .... The striking features are (a) the drop in energy eigenvalues with increasing quantum well thickness, (b) that the energy eigenvalues for a finite well are smaller than in the ∞-barrier well approximation, (c) that the eigenvalues for finite well heights gradually approach those of the ∞-well, and (d) that the finite wells have a finite number of bound states, unlike the ∞-well. The lower energy eigenvalues are expected because of the penetration of the electron wavefunction into the barrier by tunneling. Because of this, the effective width of the finite well is similar to that of a thicker ∞-well. The finite number of bound states is also expected, because any electron energy larger than the barrier height is a free state that is part of the continuum with plane-wave solutions. For example, for Lw = 15 nm with U0 = 0.3 eV barrier height and m⋆ = 0.067me, representative of the conduction band effective mass of GaAs, there are four bound states in the quantum well.
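The four-bound-state count quoted above can be reproduced with the counting formula derived in Exercise 14.2(c); a minimal sketch:

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def n_bound(Lw, U0_eV, m_eff):
    """Number of bound states N = 1 + Int[2 theta0 / pi],
    with theta0^2 = m* Lw^2 U0 / (2 hbar^2) (the formula of Exercise 14.2(c))."""
    theta0 = math.sqrt(m_eff * Lw**2 * U0_eV * q / (2 * hbar**2))
    return 1 + int(2 * theta0 / math.pi)

# The example quoted in the text: Lw = 15 nm, U0 = 0.3 eV, m* = 0.067 me
N = n_bound(15e-9, 0.3, 0.067 * me)   # -> 4 bound states
```

Any finite well holds at least one bound state in 1D, which is why the formula starts from N = 1.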



14.9 Multilayers and superlattices










Fig. 14.9 Eigenvalues of a quantum well with finite barrier height. m⋆ = 0.067me is assumed, and (c) is for a fixed Lw.

⁶ The following numerical calculations in this section are performed using a self-consistent solution of the Schrödinger equation and the Poisson equation (see Section 15.6).

Consider two quantum wells located close to each other in space. If the potential barrier separating them has an ∞ height but finite thickness, the states in the two wells are completely isolated from each other, because the wavefunction must go to zero when the potential goes to ∞. This is necessary for the terms in the Schrödinger equation to remain finite. But a finite barrier height allows the wavefunction from one well to "leak" into the other. This property is heavily used in semiconductor quantum structures called multilayers and superlattices. Fig. 14.9 (a) indicated the penetration of the envelope wavefunction into the finite barriers. Consider now⁶ in Fig. 14.10 (a) the numerically calculated conduction band edge Ec for an Al0.2Ga0.8As / GaAs / Al0.2Ga0.8As heterostructure with qφB = 0.6 eV barrier heights at the left and right ends. The GaAs quantum well is Lw = 4 nm thick and is located at the center. The Al0.2Ga0.8As layers are doped for 50 nm at Nd = 10¹⁷/cm³ around the quantum well. For Figs. 14.10 (b – f) the same Al0.2Ga0.8As layer scheme is used, but the number of quantum wells is increased from N = 1 to N = 10. The barrier layers separating the quantum wells are Al0.2Ga0.8As. Both the quantum wells and the


barriers separating them are undoped. Fig. 14.10 (a) shows the energy eigenvalue E1 for the lowest energy bound state in the quantum well in the z-direction, and the corresponding envelope function C1(z), where the penetration into the barriers is seen. In such heterostructures the x − y direction is typically uniform and the electron envelope function is a 2D plane wave in the x − y plane, but this point is not the subject of the following discussion. Here we focus primarily on the potential for motion along the z-direction by coupling between the adjacent wells. Fig. 14.10 (b) shows the situation for N = 2 quantum wells. The lowest eigenvalue now splits into two: E1 and E2, and the corresponding envelope function C1(z) takes a "bonding", and C2(z) an "antibonding" character. Note that C2(z) has one zero-crossing. Also note that for N = 2 wells the envelope function C1(z) for the lower energy state looks like the sum of the envelope functions for the N = 1 case but centered at each well. Similarly, C2(z) for N = 2 wells looks like the difference of the envelope functions of the N = 1 case centered at each well. Notice the analogy to Fig. 11.3 for the corresponding states in atoms and molecules. The N = 2 coupled quantum well exhibits a similar feature as individual atoms, except that here it is for a quantum well in a semiconductor crystal: the envelope functions C1(z) and C2(z) have to be multiplied by the periodic Bloch functions. The lateral x − y plane-wave motion is another major difference from the atomic situation. Figs. 14.10 (c – f) show successively larger numbers of coupled quantum wells. The envelope function Ci(z) corresponding to the ith eigenvalue Ei has i − 1 nodes. The reader is encouraged to examine the envelope functions and draw analogies to the tight-binding models that were described in Chapter 11, Fig. 11.6, while keeping in mind that here we are discussing envelope functions. For the N = 10 case shown in Fig. 14.10 (f), the C10(z) envelope function has 9 nodes. The lower energy states for N = 10 are sufficiently delocalized in the z-direction. This implies they can move relatively freely between the quantum wells, but are still bound by the larger barriers outside of the wells. There are additional energy eigenvalues that have energies larger than the barrier height of the quantum wells. These states are bound by the wider barriers formed between the cladding Al0.2Ga0.8As layers. Since the electron in these states can move over a larger z-distance than those inside the quantum wells, their energies are spaced more closely. Furthermore, it will be noted that as the number of coupled quantum wells N increases, more energy states are clumped into a finite energy width. Since this fact is not that clear in Fig. 14.10, the N lowest energy eigenvalues calculated for N coupled quantum wells are plotted separately in Fig. 14.11 for N = 1 to N = 16 wells. It is now seen explicitly that the spread between the maximum and minimum energy eigenvalues approaches a finite energy width as N increases. Imagine now that the number of coupled quantum wells increases to a large number N → ∞. In that case, the energy eigenvalues become





Fig. 14.10 Energy band diagrams of the conduction band of AlGaAs/GaAs heterostructures. The number of quantum wells ranges from N = 1 to N = 10. The energy eigenvalues Ei and corresponding eigenfunctions ψi(z) are shown.
















Fig. 14.11 Lowest energy eigenvalues as a function of the number of coupled wells.

Fig. 14.12 Minibands formed in a semiconductor superlattice, and the corresponding energy dispersion in the z-direction. Such superlattices are typically used in semiconductor electronic or photonic devices for carrier transport.

very closely spaced. Furthermore, similar to the tight-binding model of Fig. 11.6 for the s-band, an energy band is formed for motion in the z-direction by the combination of the ground states of the wells. This band is called a miniband, and this crystal is called a superlattice. Notice that this superlattice has a lattice constant along the z-direction that is a multiple of the lattice constant of the semiconductor crystal itself. As opposed to the lattice constant of the semiconductor crystal (e.g. GaAs or Al0.2Ga0.8As here), the effective lattice constant of the miniband is Lw + Lb, the sum of the well and barrier thicknesses. Thus, the Brillouin zone of the artificial superlattice for motion along the z-direction is −π/(Lw + Lb) ≤ kz ≤ +π/(Lw + Lb). There can be a superlattice miniband formed for each energy eigenvalue of the quantum well. For example, if the single well allowed a second bound state, this state could couple more strongly with nearby wells because of the smaller effective barrier height it sees. Therefore, a second miniband with a larger bandwidth and smaller effective mass can be formed, as is indicated schematically in Fig. 14.12.
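The shape of the miniband can be sketched with a nearest-neighbour tight-binding model in the spirit of Fig. 11.6. Note that the single-well level E1, the inter-well coupling t, and the period d below are free parameters of the sketch (they would come from the coupled-well calculation), not values computed in this section:

```python
import math

def miniband_E(kz, E1, t, d):
    """Nearest-neighbour tight-binding miniband E(kz) = E1 - 2 t cos(kz d),
    with superlattice period d = Lw + Lb."""
    return E1 - 2.0 * t * math.cos(kz * d)

d  = 8e-9      # e.g. 4 nm wells + 4 nm barriers (assumed)
E1 = 0.10      # eV, single-well ground state (assumed)
t  = 0.005     # eV, inter-well coupling (assumed)

Emin = miniband_E(0.0, E1, t, d)            # superlattice zone centre, kz = 0
Emax = miniband_E(math.pi / d, E1, t, d)    # zone edge, kz = pi/(Lw + Lb)
bandwidth = Emax - Emin                      # equals 4 t for this model
```

A larger coupling t (thinner or lower barriers) widens the miniband and lowers the effective mass for z-motion, which is the trend described for the second miniband.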

14.10 Wannier functions

Because Bloch eigenstates ψk(r) = e^{ik·r} uk(r) are extended over the entire crystal, to handle carriers that are confined to fixed regions in space, in this chapter we have developed linear combinations of the Bloch k-states to create the effective mass wavefunction ψ(r) ≈ C(r)uk(r). Since the envelope function C(r) is a slowly varying function, we realized that it is really the solution to the band-edge profile, and it can handle quantum confinement in a simple, transparent, and intuitive way. In fact, that is the reason the effective mass theory is so popular as a design tool. Instead of the extended Bloch functions, at which we arrived by starting from free electrons, one can ask if there is a complete basis set of mutually orthogonal wavefunctions that are localized on atomic sites. The idea here is similar in spirit to the tight-binding approach to bandstructure, of approaching electronic states from the molecular orbitals tied to each atom rather than from the nearly free–electron model or the pseudopotential models. Such a localized, orthogonal, and complete set of functions was introduced by Wannier. The Wannier function wn(r − R) of the nth band tied to the lattice site R is related to the Bloch function in the following way:

$$w_n(\mathbf{r} - \mathbf{R}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{k}\in BZ} e^{-i\mathbf{k}\cdot\mathbf{R}}\,\psi_{n\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{k}\in BZ} e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{R})}\, u_{n\mathbf{k}}(\mathbf{r}), \qquad (14.49)$$

which is a summation over k-states in the entire BZ. The periodic part of the Bloch function is related to the Wannier function via

$$u_{n\mathbf{k}}(\mathbf{r}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{R}} e^{-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{R})}\, w_n(\mathbf{r} - \mathbf{R}). \qquad (14.50)$$



Fig. 14.13 Real parts of Wannier functions compared to Bloch functions.

Fig. 14.13 highlights the difference between Bloch and Wannier functions. While the Bloch functions are extended, the Wannier functions are localized at atomic sites. Though the Bloch states are energy eigenstates of the electron in the crystal, the Wannier states are not. The reason we are introducing the Wannier states here is to highlight and contrast them with both the Bloch states and the effective mass states⁷. Effective mass states are linear combinations of Bloch states, and are also linear combinations of Wannier states. Since the bandstructure E(k) is known from experiments, the Bloch states are used as the basis for effective mass theory. However, for certain physical phenomena, the Bloch states are not well suited. An example of such phenomena is the presence of spontaneous and piezoelectric polarization in semiconductors. The reason is that such phenomena are classically thought of as charge dipoles localized in every unit cell, and combinations of Bloch functions cannot easily provide something that is localized. The Wannier function is superior for explaining such phenomena.
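Equation (14.49) can be tested numerically in a toy 1D model with u_k = 1 (a free-electron-like band); the check below, with an assumed 32-site lattice, verifies that the resulting Wannier functions are localized and mutually orthogonal:

```python
import math, cmath

N, a = 32, 1.0                                       # toy lattice: N sites, spacing a
ks = [2 * math.pi * m / (N * a) for m in range(N)]   # the N k-points of the BZ

def wannier(x, R):
    """w(x - R) = (1/sqrt(N)) sum_k exp(i k (x - R)), Eq. (14.49) with u_k = 1."""
    return sum(cmath.exp(1j * k * (x - R)) for k in ks) / math.sqrt(N)

# Localization: |w| peaks at its home site and vanishes at every other lattice site.
w_home  = abs(wannier(0.0, 0.0))      # = sqrt(N) at the home site
w_other = abs(wannier(5 * a, 0.0))    # = 0 at a different lattice site

# Orthogonality of Wannier functions centred on different sites,
# using the discrete inner product over the N lattice sites:
overlap = sum(wannier(n * a, 0.0).conjugate() * wannier(n * a, 3 * a) for n in range(N))
```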

14.11 Chapter summary section

In this chapter, we learned:

• How the effective mass theory maps the complex problem of electrons in periodic or quantum confined potentials into the simplest problems of quantum mechanics, such as the hydrogen atom or the various particle-in-a-box problems, and
• How to exploit this connection to understand, and design, semiconductor doping and heterostructure-based quantum wells, wires, and dots.

⁷ An additional significance of Wannier functions is that they can be defined separately for each band, whereas wavepackets from Bloch functions require basis elements from multiple bands.

Fig. 14.14 Gregory Wannier introduced the concept of spatially orthogonal wavefunctions in crystals.


Further reading

The Physics of Low Dimensional Semiconductors by Davies provides an excellent treatment of the effective mass method, and its applications to several quantized heterostructures with beautiful illustrations and examples. Wave Mechanics Applied to Semiconductor Heterostructures by Bastard is a classic text that explains several intricacies of the quantum mechanics of heterointerfaces using effective mass methods.

Exercises

(14.1) Effective mass methods: doping and quantum dots
We derived the effective mass approximation in this chapter, in which the complicated problem of a free electron in a periodic potential was mapped to a much simpler problem of an electron with an effective mass in a free potential. The key step was to create a wavepacket by constructing a linear combination of the Bloch eigenstates ψk(r) = uk(r)e^{ikr} in the form

$$\phi(r) = \int \frac{dk}{2\pi}\, C(k)\, u_k(r)\, e^{ikr},$$

and using Fourier transform properties with the Hamiltonian operator. The effective mass equation in the presence of perturbations to the periodic crystal potential is then written as

$$[E_n(-i\nabla) + W]\, C(r) = E\, C(r),$$

where C(r) is the envelope function, and E_n(−i∇) is an operator obtained from the bandstructure E_n(k) of band n by replacing k → −i∇. W is a time-dependent, or time-independent, perturbation to the periodic potential of the crystal. Instead of the Bloch functions ψk(r), we can now work with the envelope functions C(r), remembering that the wavefunction of the wavepacket is the product of the envelope function and the periodic part of the Bloch function, i.e., φ(r) = C(r)uk(r). For this problem, consider the conduction band with a parabolic bandstructure characterized by an effective mass m⋆c and band-edge Ec, such that

$$E_c(k) = E_c + \frac{\hbar^2 k^2}{2 m^\star_c}.$$

Fig. 14.15 Walter Kohn with Joaquin Luttinger developed the effective mass theory for semiconductors. The effective mass theory arms us with a powerful method to design semiconductor nanostructures and enormously simplifies the design of quantum wells, wires, and dots. Kohn was awarded the Nobel Prize in 1998 for the development of density functional theory (DFT).

(a) Show that the effective mass equation is then

$$\left[-\frac{\hbar^2}{2 m^\star_c}\nabla^2 + W\right] C(r) = (E - E_c)\, C(r).$$

Note that this is in the form of a modified Schrödinger equation, and is referred to as the effective mass Hamiltonian. Show that the solutions in the absence of the perturbation are simple plane waves, $C_k(r) = \frac{1}{\sqrt{V}} e^{ikr}$. Find k.

(b) Doping: When we introduce a dopant atom in the semiconductor (see Fig. 14.16), the perturbation due to a single donor atom in a 3-dimensional crystal semiconductor is a Coulomb potential $W(r) = -\frac{e^2}{4\pi e_s r}$, with es the dielectric constant of the semiconductor. Argue that this effective mass problem maps exactly to the hydrogen atom problem, and show that the energies of the shallow donors are $E_n = E_c - Ry^\star/n^2$, where $Ry^\star = \frac{m^\star_c/m_e}{(e_s/e_0)^2}\, Ry_0$, and Ry0 = 13.6 eV is the binding energy of the electron in the ground state of the hydrogen atom. Also show that the radius of the donor electron is modified from the hydrogen electron radius to $a^\star_B = \frac{e_s/e_0}{m^\star_c/m_e}\, a^0_B$, where a⁰B = 0.53 Angstrom is the Bohr radius of the electron in the hydrogen atom. Estimate the ionization energy of the donor, and the radius, in a semiconductor with es = 10e0 and m⋆c = 0.1me. From these considerations, argue why bands with heavy effective masses may be difficult to dope.
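A quick numerical sketch of the hydrogenic scaling in part (b), for the es = 10e0, m⋆c = 0.1me case asked for (the function name is illustrative):

```python
Ry0_eV = 13.6    # hydrogen Rydberg (ground-state binding energy)
aB0_nm = 0.053   # hydrogen Bohr radius, 0.53 Angstrom in nm

def donor_scaling(m_ratio, eps_ratio):
    """Effective Rydberg Ry* = (m*/me) / (es/e0)^2 * Ry0 and
    effective Bohr radius a* = (es/e0) / (m*/me) * aB0."""
    return Ry0_eV * m_ratio / eps_ratio**2, aB0_nm * eps_ratio / m_ratio

Ry_star, a_star = donor_scaling(0.1, 10.0)
# Ry* = 13.6 meV (shallow) and a* = 5.3 nm, spanning many unit cells --
# which is exactly what justifies the envelope-function treatment of the donor
```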


Fig. 14.16 Schematic representation of a charged donor, its activation energy ED and the envelope function of the donor electron C (r ).

(c) Quantum Dots: Suppose we have a narrow-bandgap semiconductor quantum dot of size Lx = Ly = Lz = L embedded in a wide-bandgap semiconductor matrix. Assume the conduction band offset ∆Ec and the valence band offset ∆Ev are very large, such that an electron in the conduction band and holes in the valence band of the quantum dot effectively see infinitely tall barriers. Find the allowed energies of the electron and hole states in the quantum dot as a function of the dot size L, and the conduction and valence band effective masses m⋆c and m⋆v. If the bulk bandgap of the narrow-bandgap semiconductor is Eg, what is the energy of a photon that will be emitted if an electron transitions from the CB ground state to the VB ground state? Make a plot of the emitted photon energy as a function of the quantum dot size from 1 nm ≤ L ≤ 10 nm, for the following parameters of the narrow-bandgap semiconductor: m⋆c = m⋆v = 0.1me, Eg = 0.8 eV. Fig. 14.17 shows how such quantum confinement is used to create photons of large energy for deep ultraviolet LEDs and lasers.

Fig. 14.17 A quantum dot UV light-emitting diode realized by blue-shifting the emission from GaN by quantum confinement.
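For part (c), with infinite barriers the emitted photon energy is Eg plus the (1,1,1) confinement energies of the electron and the hole; a sketch with the stated parameters:

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def photon_energy_eV(L, Eg_eV=0.8, mc_ratio=0.1, mv_ratio=0.1):
    """E_photon = Eg + 3 (hbar pi / L)^2 / (2 mc*) + 3 (hbar pi / L)^2 / (2 mv*):
    the bandgap plus the (1,1,1) ground-state confinement of electron and hole."""
    conf = 3 * (hbar * math.pi / L)**2 / 2
    return Eg_eV + conf * (1 / (mc_ratio * me) + 1 / (mv_ratio * me)) / q

E_10nm = photon_energy_eV(10e-9)   # mild blue-shift for a large dot
E_2nm  = photon_energy_eV(2e-9)    # strong blue-shift for a small dot
```

The strong 1/L² blue-shift with shrinking dot size is the knob exploited in Fig. 14.17 (keeping in mind the effective-mass picture degrades for the smallest dots).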

(14.2) Quantum well heterostructures
The finite quantum well problem is the basis of all quantized structures based on compound semiconductor heterostructures. In this problem you evaluate some examples to gain insight, and collect some very useful formulae for the quantum design of heterostructure devices.
(a) With relevant formulae and sketches, outline a graphical method for identifying the bound state eigenvalues and eigenfunctions in a finite quantum well of height U0 and width Lw for a quantum well semiconductor material with effective mass m⋆. Show that the solutions for the allowed k values take the form

$$\sqrt{\frac{\theta_0^2}{\theta^2} - 1} = \tan\theta \quad \text{and} \quad \sqrt{\frac{\theta_0^2}{\theta^2} - 1} = -\cot\theta,$$

where $\theta = \frac{k L_w}{2}$, and the characteristic constant is $\theta_0^2 = \frac{m^\star L_w^2 U_0}{2\hbar^2}$.

(b) Show that in the case of a vanishingly small barrier height U0 → 0, there is still at least one bound state for the 1D quantum well, with a binding energy equal to $U_0 - E_1 \approx \theta_0^2 U_0$.

(c) Show that the number of bound states is

$$N = 1 + \mathrm{Int}\left[\frac{2\theta_0}{\pi}\right],$$

where Int[x] is the largest integer smaller than x. Show that the numerical value is

$$N = 1 + \mathrm{Int}\left[1.63\left(\frac{L_w}{1\,\text{nm}}\right)\sqrt{\left(\frac{m^\star}{m_0}\right)\cdot\left(\frac{U_0}{1\,\text{eV}}\right)}\right].$$

(d) Now consider the electron states in the conduction band of a heterostructure quantum well shown in Fig. 14.6, with U0 = ∆Ec = 0.3 eV, and m?c = 0.067me . How many bound states does a well of thickness Lw = 5 nm hold? Write the effective mass wavefunctions C (r ) for electrons in the ground state of the quantum well, and find its characteristic penetration depth into the barrier layer. (e) Find the Fermi level in the quantum well if we fill the quantum well with electrons of 2D density ns = 1012 /cm2 .
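The graphical conditions of part (a) can also be solved numerically. The sketch below brackets the tan and −cot branches and bisects; the bracketing scheme is my own, and the example is the Lw = 5 nm, U0 = 0.3 eV, m⋆c = 0.067me well of part (d):

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def bound_states_eV(Lw, U0_eV, m_eff):
    """Bound-state energies (from the well bottom) of the finite square well, from
    sqrt(theta0^2/theta^2 - 1) = tan(theta) (even) and = -cot(theta) (odd),
    theta = k Lw / 2, theta0^2 = m* Lw^2 U0 / (2 hbar^2). Solved by bisection."""
    th0 = math.sqrt(m_eff * Lw**2 * U0_eV * q / (2 * hbar**2))
    def f_even(t): return t * math.tan(t) - math.sqrt(max(th0*th0 - t*t, 0.0))
    def f_odd(t):  return -t / math.tan(t) - math.sqrt(max(th0*th0 - t*t, 0.0))
    roots = []
    for f, start in [(f_even, 0.0), (f_odd, math.pi / 2)]:
        t = start
        while t < th0:                      # one candidate root per half-period
            lo, hi = t + 1e-9, min(t + math.pi/2 - 2e-9, th0 - 1e-12)
            if hi > lo and f(lo) * f(hi) < 0:
                for _ in range(80):         # bisection
                    mid = 0.5 * (lo + hi)
                    lo, hi = (mid, hi) if f(lo) * f(mid) > 0 else (lo, mid)
                th = 0.5 * (lo + hi)
                roots.append((2*th/Lw)**2 * hbar**2 / (2*m_eff) / q)  # E = hbar^2 k^2/2m*
            t += math.pi
    return sorted(roots)

E = bound_states_eV(5e-9, 0.3, 0.067 * me)   # the part (d) well: two bound states
```

Both eigenvalues come out below their infinite-well counterparts, consistent with the barrier-penetration argument of Section 14.8.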

(14.3) A heterostructure step barrier

Fig. 14.18 A heterostructure step barrier.

Fig. 14.18 shows a step barrier for electrons formed at a heterojunction between two semiconductors of different bandgaps.

(a) Consider the 1D situation, and the conduction band electron effective mass m⋆c to be identical in all regions. Show that the transmission probability for an electron incident from the left with kinetic energy $E = \frac{\hbar^2 k_l^2}{2 m^\star_c}$ is

$$T(E) = \frac{1}{1 + \frac{(\Delta E_c)^2}{4E(E - \Delta E_c)}\,\sin^2(k_b a)}, \qquad (14.53)$$

where $k_b = \sqrt{\frac{2 m^\star_c}{\hbar^2}(E - \Delta E_c)}$.

Solution: Let the electron wavevectors in the left, barrier, and right regions be kl, kb and kr respectively. The wavefunctions are respectively $\psi_l(x) = e^{i k_l x} + r_l e^{-i k_l x}$, $\psi_b(x) = t_b e^{i k_b x} + r_b e^{-i k_b x}$ and $\psi_r(x) = t_r e^{i k_r x}$. Equating the wavefunctions and their first derivatives at x = 0, $\psi_l(0) = \psi_b(0)$, $\psi_l'(0) = \psi_b'(0)$, and at x = a, $\psi_b(a) = \psi_r(a)$, $\psi_b'(a) = \psi_r'(a)$, gives 4 equations in the 4 unknowns $r_l, t_b, r_b, t_r$. Note that each of these coefficients is in general a complex quantity. Solving the 4 equations gives

$$r_l = \frac{r_{bl} - r_{br}\, e^{2 i k_b a}}{r_{bl}\, r_{br}\, e^{2 i k_b a} - 1}, \quad t_r = \frac{t_{bl}\, e^{i(k_b - k_r)a}}{1 - r_{bl}\, r_{br}\, e^{2 i k_b a}}, \quad t_b = \frac{t_l}{1 - r_{bl}\, r_{br}\, e^{2 i k_b a}}, \quad r_b = \frac{t_l\, r_{br}\, e^{2 i k_b a}}{1 - r_{bl}\, r_{br}\, e^{2 i k_b a}}.$$

Here $r_{bl} = \frac{k_b - k_l}{k_b + k_l}$, $r_{br} = \frac{k_b - k_r}{k_b + k_r}$, $t_l = \frac{2 k_l}{k_b + k_l}$, and $t_{bl} = \frac{4 k_b k_l}{(k_r + k_b)(k_l + k_b)}$.

From the amplitudes, the transmission probability across the barrier is given by $T(E) = |t_r|^2$, the same as Equation 14.53.

(b) Find the reflection probability R(E) as a function of the kinetic energy of the electron, and find an approximation for small barrier heights.

Solution: Since an electron incident from the left is either reflected or transmitted across the barrier layer to the right, the reflection probability and transmission probability must add to unity:

$$R(E) = 1 - T(E) = \frac{1}{1 + \frac{4E(E - \Delta E_c)}{(\Delta E_c)^2}\,\csc^2(k_b a)}. \qquad (14.54)$$

This problem will be taken up in Chapter 24, since it is central to tunneling through barriers.
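Equation (14.53) and the sum rule R + T = 1 are easy to check numerically, including the unit-transmission resonances at k_b a = nπ; the barrier parameters below are illustrative, not from the exercise:

```python
import math

hbar = 1.0545718e-34; me = 9.1093837e-31; q = 1.602176634e-19

def T_barrier(E_eV, dEc_eV, a, m_eff):
    """Above-barrier (E > dEc) transmission of Eq. (14.53):
    T = [1 + dEc^2 sin^2(kb a) / (4 E (E - dEc))]^-1, kb = sqrt(2 m*(E - dEc))/hbar."""
    kb = math.sqrt(2 * m_eff * (E_eV - dEc_eV) * q) / hbar
    return 1.0 / (1.0 + dEc_eV**2 * math.sin(kb * a)**2 / (4 * E_eV * (E_eV - dEc_eV)))

m, dEc, a = 0.067 * me, 0.3, 5e-9   # GaAs-like barrier (assumed values)
T = T_barrier(0.5, dEc, a, m)
R = 1.0 - T                          # Eq. (14.54): R and T must add to unity

# resonance: choose E so that kb a = pi exactly, giving T = 1
E_res = dEc + (hbar * math.pi / a)**2 / (2 * m) / q
```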

Carrier Statistics and Energy Band Diagrams


Once a particular semiconductor structure is realized, the electron energy bandstructures E(k) are locked in. As opposed to E(k) which is the solution of the quantum mechanical energy eigenvalues of electrons in the k-space, energy band diagrams tell us how the allowed electron energies vary in real space. The atomic composition of the semiconductor heterostructures, and the chemical doping determine how the mobile electron and hole concentrations, and the energy bands change in real space. In other words, they determine how the energy band edges around the bandgap Ec (r) and Ev (r) vary in real space. In this chapter, we develop the following core concepts essential towards controlling and manipulating electrons and holes in semiconductor electronic and photonic devices:

• What are the mobile carrier concentrations in an intrinsic, pure semiconductor?
• How can we vary the mobile carrier concentrations in semiconductors by factors of ∼ 10¹⁰ by doping?
• What are the carrier statistics and energy band diagrams in homojunctions: when the semiconductor is the same, but the doping varies in space?
• What are the carrier statistics and energy band diagrams in heterojunctions: when both the semiconductors and the doping vary in space?

15.1 Carrier statistics

Consider first a large piece of an intrinsic semiconductor. Assume it has no donor or acceptor doping. How many mobile electrons and holes can I measure in it at room temperature? We have already learned the carrier statistics relations: that the electron and hole densities in 3D are

$$n = N_c\, F_{\frac{1}{2}}\left(\frac{E_F - E_c}{k_b T}\right) \quad \text{and} \quad p = N_v\, F_{\frac{1}{2}}\left(\frac{E_v - E_F}{k_b T}\right), \qquad (15.1)$$

where $N_c = N_c^{3d} = g_s\, g_{vc}\left(\frac{2\pi m^\star_c k_b T}{h^2}\right)^{\frac{3}{2}}$ and $N_v = N_v^{3d} = g_s\, g_{vv}\left(\frac{2\pi m^\star_v k_b T}{h^2}\right)^{\frac{3}{2}}$. Note that EF is the Fermi level, Ec is the conduction band edge, and

15.1 Carrier statistics
15.2 EF is constant at thermal equilibrium
15.3 Metal-semiconductor Schottky junctions
15.4 p-n homojunctions
15.5 Heterojunctions
15.6 Energy band diagrams: Poisson+Schrödinger
15.7 Polarization-induced heterostructures
15.8 Chapter summary section
Further reading


Ev is the valence band edge. The bandgap is Eg = Ec − Ev. The spin degeneracy is gs = 2. The valley degeneracy will depend on the bandstructure E(k) around the bandgap. For the simplest case, assuming there is only one conduction band minimum and one valence band maximum, gvc = gvv = 1, the band edge DOS are $N_c = 2\left(\frac{2\pi m^\star_c k_b T}{h^2}\right)^{\frac{3}{2}}$ and $N_v = 2\left(\frac{2\pi m^\star_v k_b T}{h^2}\right)^{\frac{3}{2}}$. Then, at equilibrium, since all the electrons in the conduction band in an intrinsic semiconductor must have come from thermal excitation of the valence band electrons across the bandgap, their numbers must be equal: n = p = ni. The density ni is called the intrinsic carrier concentration, and is a characteristic property of any semiconductor (Fig. 15.1). This intrinsic density is then given by

$$n = N_c\, F_{\frac{1}{2}}\left(\frac{E_F - E_c}{k_b T}\right) = N_v\, F_{\frac{1}{2}}\left(\frac{E_v - E_F}{k_b T}\right) = p = n_i,$$

$$\Longrightarrow n_i^2 = N_c N_v\, F_{\frac{1}{2}}\left(\frac{E_F - E_c}{k_b T}\right) F_{\frac{1}{2}}\left(\frac{E_v - E_F}{k_b T}\right),$$





Fig. 15.1 Intrinsic carrier concentrations ni vs. temperature for semiconductors with different effective masses and bandgaps. Note that at 500◦ C ∼ 773K, intrinsic wide bandgap semiconductors (e.g. GaN or SiC) of an energy gap 3 eV have very few thermally generated electrons and holes compared to silicon, making wide bandgap semiconductors suitable for high-temperature electronics.

1 This in turn has several implications on

their use: it is rather difficult to make a narrow bandgap semiconductor electrically insulating at ambient temperatures. Famously, graphene has good electron transport properties, but since it has a zero bandgap, it cannot be made insulating: back in Exercise 5.2 you have calculated that the 2D carrier density varies as T 2 , and cannot be lowered below ∼ 1011 /cm2 at room temperature. On the other hand, it is simpler to make a wide-bandgap semiconductor insulating, its conductivity can be varied over a very large range.

EF − Ec Ev − EF ) F1 ( ), 2 kb T kb T


which is a general relation for the intrinsic carrier concentration. Under non-degenerate conditions, E_v < E_F < E_c, with the Fermi level several k_bT away from both band edges; in other words, E_F is deep in the bandgap of the semiconductor. In this situation, we get (see Fig. 15.2)

n ≈ N_c e^{(E_F − E_c)/k_bT} and p ≈ N_v e^{(E_v − E_F)/k_bT},  (15.3)

n = p = n_i ⟹ n_i ≈ √(N_c N_v) e^{−E_g/2k_bT}.
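These relations are easy to check numerically. The short sketch below (illustrative, not from the book) evaluates the Fermi–Dirac integral F_{1/2} by simple trapezoidal quadrature, verifies the Boltzmann limit F_{1/2}(η) → e^η for η << 0 used above, and computes n_i ≈ √(N_c N_v) e^{−E_g/2k_bT} for the example values m_c* = m_v* = 0.1m_e and E_g = 1 eV at 300 K quoted in the text:

```python
import math

# Physical constants (SI)
kB = 1.380649e-23      # J/K
h  = 6.62607015e-34    # J*s
me = 9.1093837015e-31  # kg
q  = 1.602176634e-19   # C, converts eV -> J

def fermi_dirac_half(eta, n_steps=20000, u_max=60.0):
    """F_{1/2}(eta) = (2/sqrt(pi)) * int_0^inf sqrt(u)/(1+exp(u-eta)) du,
    normalized so that F_{1/2}(eta) -> e^eta in the non-degenerate limit."""
    du = u_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        u = i * du
        w = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end weights
        total += w * math.sqrt(u) / (1.0 + math.exp(u - eta))
    return (2.0 / math.sqrt(math.pi)) * total * du

def N3d(m_eff, T):
    """Band-edge DOS N = 2 (2 pi m* kB T / h^2)^(3/2), in m^-3."""
    return 2.0 * (2.0 * math.pi * m_eff * kB * T / h**2) ** 1.5

def n_intrinsic(mc, mv, Eg_eV, T):
    """n_i = sqrt(Nc Nv) exp(-Eg/2kBT) in the non-degenerate limit, m^-3."""
    return math.sqrt(N3d(mc, T) * N3d(mv, T)) * math.exp(-Eg_eV * q / (2 * kB * T))

# Boltzmann limit check: F_{1/2}(-5) vs e^{-5}
print(fermi_dirac_half(-5.0), math.exp(-5.0))
# Intrinsic carrier density: a few x 10^9 /cm^3, i.e. of order 10^10 /cm^3
print(n_intrinsic(0.1 * me, 0.1 * me, 1.0, 300.0) * 1e-6, "/cm^3")
```

With these values the script returns a few ×10^9/cm³, consistent with the order-of-magnitude estimate n_i ≈ 10^10/cm³ discussed next.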
Thus, the 3D intrinsic carrier concentration goes as n_i ∼ T^{3/2} e^{−E_g/2k_bT}, and depends only on the material bandstructure via the effective masses and the bandgap. As a simple example, if m_c* = m_v* = 0.1m_e and E_g = 1 eV, then n_i ≈ 10^10/cm³ at T = 300 K. A narrow-bandgap semiconductor has a very high intrinsic carrier concentration, and a wide-bandgap semiconductor has very few, as is expected intuitively¹.

For GaAs, there is only one conduction band minimum with m_c* ∼ 0.067m_e, thus the conduction band valley degeneracy is g_vc = 1. The valence band degeneracy is also g_vv = 1 for each of the light-hole, heavy-hole, and split-off bands. Because there are three distinct valence bands: the light-hole (LH) band, the heavy-hole (HH) band, and the split-off (SO) band (see Section 12.5), there will be holes in each of them. The total hole density is the sum of the hole densities of each band:

p = N_v^{LH} F_{1/2}[(E_v^{LH} − E_F)/k_bT] + N_v^{HH} F_{1/2}[(E_v^{HH} − E_F)/k_bT] + N_v^{SO} F_{1/2}[(E_v^{SO} − E_F)/k_bT] = p_{LH} + p_{HH} + p_{SO},  (15.4)

with the appropriate band-edges and effective masses. For example, if the LH and HH bands are degenerate at k = 0, the Γ-point of the



Fig. 15.2 Density of states g( E), the Fermi function f ( E), and the occupied electron and hole densities for a 3D semiconductor, and a quantum well that hosts quasi-2D DOS. For intrinsic (undoped) semiconductors at thermal equilibrium, electrons are thermally ionized across the bandgap. Thus, the number of electrons in the conduction band n and the number of holes p in the valence band are the same, each equal to the intrinsic carrier concentration ni characteristic of the semiconductor.

Brillouin zone, E_v = E_v^{LH} = E_v^{HH}, and the split-off band is at an energy Δ below it, the total hole density is given by

p = 2(2πk_bT/h²)^{3/2} [(m_v^{LH})^{3/2} + (m_v^{HH})^{3/2}] F_{1/2}[(E_v − E_F)/k_bT] + 2(2πm_v^{SO}k_bT/h²)^{3/2} F_{1/2}[(E_v − Δ − E_F)/k_bT],  (15.5)

which may be written as

p = 2(2πm_v^{av}k_bT/h²)^{3/2} F_{1/2}[(E_v − E_F)/k_bT] + 2(2πm_v^{SO}k_bT/h²)^{3/2} F_{1/2}[(E_v − Δ − E_F)/k_bT],  (15.6)

where (m_v^{av})^{3/2} = (m_v^{LH})^{3/2} + (m_v^{HH})^{3/2}, which is sometimes referred to as a "density of states effective mass"². Now for indirect bandgap semiconductors such as silicon and germanium, there are several equal-energy valleys in the conduction band, whose minima are not at the k = (0, 0, 0) or Γ-point. In silicon, there are g_v = 6 minima, each located ∼0.8 of the way along the Γ−X directions, for example at k_min ≈ (0.8·2π/a, 0, 0), where a is the lattice constant. We label E(k_min) = E_c as the conduction band edge; for silicon E_c − E_v = 1.1 eV. The energy eigenvalues of the conduction bandstructure written out for small values³ of (k_x, k_y, k_z) = k′ − k_min are given by

E(k_x, k_y, k_z) = E_c + (ħ²/2)(k_x²/m_1 + k_y²/m_2 + k_z²/m_3),  (15.7)


where m1 , m2 , m3 are the effective masses along the k x , k y , and k z directions. For silicon, m1 = m L ≈ 0.98me is the longitudinal electron effective mass for states moving in the six equivalent Γ − X directions,

2 The nomenclature is varied and sometimes confusing, so the emphasis should be on grasping the physical meaning.

3 We are effectively shifting the origin of the k-space to the conduction band minimum point, to keep the method general for all indirect bandgap semiconductors. For silicon, (k_x, k_y, k_z) = k′ − k_min = (k_x′ − 0.8·2π/a, k_y′ − 0, k_z′ − 0).


and m_2 = m_3 = m_T ≈ 0.19m_e is the electron transverse effective mass for carriers moving perpendicular to the Γ−X directions, as indicated in Fig. 15.3. Now the electron density in the conduction band is written simply by summing over the occupied states in the k-space:

n = g_s g_v Σ_k f(k) = g_s g_v ∫ [dk_x dk_y dk_z/(2π)³] · 1/(1 + e^{([E_c + (ħ²/2)(k_x²/m_1 + k_y²/m_2 + k_z²/m_3)] − E_F)/k_bT}),  (15.8)

which the transformation k_x′ = √(m_e/m_1)·k_x, k_y′ = √(m_e/m_2)·k_y, and k_z′ = √(m_e/m_3)·k_z converts to a familiar form:

n = g_s g_v √(m_1m_2m_3/m_e³) ∫ [dk_x′ dk_y′ dk_z′/(2π)³] · 1/(1 + e^{([E_c + ħ²((k_x′)² + (k_y′)² + (k_z′)²)/2m_e] − E_F)/k_bT})
  = g_s g_v (2πm^{av}k_bT/h²)^{3/2} F_{1/2}[(E_F − E_c)/k_bT] = N_c F_{1/2}[(E_F − E_c)/k_bT], where m^{av} = (m_1m_2m_3)^{1/3}.  (15.9)

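Equation 15.9 applied to silicon gives the familiar band-edge DOS; a quick numerical check (illustrative, using m_L ≈ 0.98m_e, m_T ≈ 0.19m_e, g_s = 2 and g_v = 6 from the text):

```python
import math

kB, h, me = 1.380649e-23, 6.62607015e-34, 9.1093837015e-31  # SI units
mL, mT = 0.98 * me, 0.19 * me  # silicon longitudinal/transverse masses
gs, gv = 2, 6                  # spin degeneracy; 6 conduction valleys
T = 300.0

m_av = (mL * mT**2) ** (1.0 / 3.0)  # per-valley DOS effective mass (mL mT^2)^(1/3)
Nc = gs * gv * (2.0 * math.pi * m_av * kB * T / h**2) ** 1.5  # in m^-3

print(f"m_av = {m_av/me:.3f} m_e")          # ~0.33 m_e
print(f"Nc = {Nc*1e-6:.2e} /cm^3 at 300 K") # ~2.8e19 /cm^3
```

The result, N_c ≈ 2.8×10^19/cm³, is the commonly quoted room-temperature value for silicon.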

Fig. 15.3 Conduction band minima of silicon showing the g_v = 6 valleys along the six Γ−X directions, and the origin of the DOS effective mass (m_L m_T²)^{1/3}.

The resulting effective mass that enters N_c is m^{av} = (m_1m_2m_3)^{1/3}, and the appropriate valley degeneracy g_v must be used. For example, for silicon g_v = 6 for the six equivalent k_min points along the Γ−X axes, and m^{av} = (m_L m_T²)^{1/3}. Similarly, for Ge the conduction band minima are in the Γ−L directions at the BZ edge, but only half of each ellipsoid falls within the first BZ. This implies g_v = (1/2)·8 = 4 and m^{av} = (m_L m_T²)^{1/3}, with m_L = 1.6m_e and m_T = 0.08m_e. The carrier densities in the bands can be found in this manner for any sort of bandstructure.

Because in an intrinsic semiconductor all electrons in the conduction band are thermally ionized from the valence band, their densities must be equal: n = p. This relation uniquely determines the Fermi level E_F. A more general relation for the Fermi level, which holds even when n ≠ p, is obtained directly from Equation 15.3:

4 The closer the Fermi level is to a band, the more mobile carriers in that band. Conversely, the Fermi level is closer to the band that has more mobile carriers.

E_F = (E_c + E_v)/2 + (k_bT/2) ln[(n/p)·(N_v/N_c)].  (15.10)


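Equation 15.10 is easily explored numerically; a small sketch (illustrative, energies in eV) showing the shift of E_F away from mid-gap when n ≠ p:

```python
import math

def fermi_level(Ec, Ev, n_over_p, Nv_over_Nc, kT=0.02585):
    """E_F = (Ec + Ev)/2 + (kT/2) ln[(n/p)(Nv/Nc)], all energies in eV."""
    return 0.5 * (Ec + Ev) + 0.5 * kT * math.log(n_over_p * Nv_over_Nc)

# Intrinsic case with Nc = Nv: E_F sits exactly at mid-gap.
print(fermi_level(Ec=1.1, Ev=0.0, n_over_p=1.0, Nv_over_Nc=1.0))    # 0.55
# n = 100 p pushes E_F up from mid-gap by (kT/2) ln(100) ~ 0.06 eV.
print(fermi_level(Ec=1.1, Ev=0.0, n_over_p=100.0, Nv_over_Nc=1.0))  # ~0.61
```

The logarithm makes the shift gentle: every factor of 10 in n/p moves E_F by only ∼30 meV at room temperature.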
In the intrinsic semiconductor, n = p = n_i, and if N_c ≈ N_v, the Fermi level is located close to the middle of the bandgap. This is seen in Fig. 15.4. To be more accurate, the Fermi level is located closer to the band that has a lower DOS⁴. From Equation 15.10, if n >> p, the logarithmic term is positive and the Fermi level moves up from the mid-gap towards the conduction band edge E_c. If p >> n, the logarithmic term is negative and E_F moves down from the mid-gap towards E_v. This is shown schematically in Fig. 15.4, with the corresponding changes in the electron and hole concentrations in the bands. At thermal equilibrium, np = n_i² is still maintained, even when the Fermi level is not at the mid-gap energy, implying that even when say n >> p, we know precisely how few holes there are: p = n_i²/n.



Fig. 15.4 Donor and acceptor doping in a 3D semiconductor. Doping with donor atoms moves the Fermi level EF close to the conduction band, and the semiconductor becomes n-type. Doping with acceptors moves EF close to the valence band, making the semiconductor p-type.

Now the location of the Fermi level with respect to the band edges of the semiconductor is the single most powerful tool in the design of any semiconductor device, from diodes to transistors, from solar cells to LEDs and lasers. In the operation of each of these devices, the semiconductor is typically pulled out of thermal equilibrium by applying an external voltage, or due to incident photons. In this chapter, we discuss how we control the Fermi level of semiconductors at thermal equilibrium⁵. The Fermi level of semiconductors at thermal equilibrium is controlled by chemical doping. In Chapter 13 we discussed the atomic picture of doping. Replacing an atom of an intrinsic semiconductor with one that has one extra valence electron leads to n-type doping. This happens if the electron energy eigenvalue of the extra state E_d is located close to the conduction band edge, whereupon the extra electron, instead of being tied to the donor atom, is thermally ionized into the band and becomes mobile. This is shown in the energy band diagram picture in Fig. 15.5. For this to happen effectively at room temperature, the donor energy level should be within ∼k_bT ≈ 26 meV below the conduction band edge. The exact complement of this form of n-doping occurs if the dopant atom has one less valence electron than the atom it substitutes: this acceptor dopant captures an electron from the filled valence band, thus populating the valence band with a mobile hole. The ionized donor atom is therefore +vely charged, and the ionized acceptor atom is -vely charged, as seen in Fig. 15.5. The Fermi level moves closer to the conduction band for donor doping, and closer to the valence band for acceptor doping, as seen in Fig. 15.4. How do we now find the Fermi level for a general doping situation? The answer to this question is always the same: To find the Fermi level, write down the equation for charge neutrality.
The Fermi level is located at the energy that exactly balances all negative and all

5 Under non-equilibrium situations, the relation np = n_i² changes to the form np = n_i² · exp[(E_Fn − E_Fp)/k_bT], where E_Fn and E_Fp are the electron and hole quasi-Fermi levels. We will discuss and use this concept in later chapters when we discuss semiconductor devices.


Fig. 15.5 Donor doping populates the conduction band with mobile electrons, leaving behind a +ve bound donor ion. Acceptor doping populates the valence band with mobile holes, leaving behind a -ve bound acceptor ion.

positive charges. Compared to the mobile electrons n and mobile holes p that were the only charges in the intrinsic semiconductor, in a doped semiconductor we must now also account for the +vely charged donors and -vely charged acceptors. Out of N_D density of donors, a fraction N_D⁺ are ionized; similarly for N_A acceptors we have N_A⁻ ionized acceptors. The charge neutrality equation now reads

n + N_A⁻ = p + N_D⁺.

Fig. 15.6 On-site Coulomb repulsion, an electron-electron (or many-particle) interaction effect modifies the occupation statistics of dopants, making it different from the Fermi–Dirac distribution.

6 Since a single electron feels no Coulomb repulsion due to itself, the factor U enters only if there are two electrons. This is the reason the energy is not 2(E_d + U).


In this charge neutrality equation, choosing a semiconductor and its doping densities N_D and N_A fixes all terms of the equation, leading to a unique determination of E_F. To find it, we must first know what fraction of dopants are ionized: i.e., what are N_D⁺/N_D and N_A⁻/N_A? It is enticing to use the Fermi–Dirac distribution directly to find the occupation functions of the donor states E_d and acceptor states E_a, but alas, that is incorrect, as has been found experimentally. The reason is rather interesting: here we encounter a taste of a genuine many-particle effect which makes the Fermi–Dirac distribution invalid. Consider a donor state of energy E_d as shown in Fig. 15.6. Here are all the possible scenarios for this energy state: the extra electron it brought in may be ionized into the band, in which case the state itself is occupied with n = 0 electrons, with total energy 0. The electron may not be ionized, in which case the state is filled with n = 1 electron, with either up or down spin, with energy E_d for each case. A third possibility is an intriguing one: since the donor eigenvalue can hold two electrons of opposite spins, this forms another state, but with a twist. Recall from Chapter 14 that the electron envelope wavefunction C(r) ∼ e^{−r/a_B*} of the donor-bound electron state is not an extended Bloch function, but is localized like a hydrogen atom with an effective Bohr radius a_B* that extends over ∼10s of lattice constants. What this means is that the second electron is physically at roughly a distance a_B* from the first electron, and must experience a strong Coulomb repulsion. This repulsion energy U thus raises the total energy of the two-electron state to⁶ 2E_d + U, where U is typically in the eV range, much larger than k_bT. This strong electron-electron interaction violates the intrinsic assumption of non-interacting electrons used to derive the Fermi–Dirac distribution in Chapter 4, Section 4.4. Thus,


we must re-derive the many-particle distribution function. It is as simple as the original derivation, and the result looks deceptively similar, but is not quite the same. The derivation of the distribution function uses the same recipe as used in Section 4.4: find all system energy states, find⁷ the partition function Z = Σ_n e^{−β(E_n − nE_F)}, and then the probability of each state n_i is e^{−β(E_{n_i} − n_i E_F)}/Z. The values of n and the corresponding E_n are shown in Fig. 15.6. The average occupation function of the state is:

⟨n⟩ = [0·e⁰ + 1·e^{−β(E_d − E_F)} + 1·e^{−β(E_d − E_F)} + 2·e^{−β(2E_d + U − 2E_F)}] / [e⁰ + e^{−β(E_d − E_F)} + e^{−β(E_d − E_F)} + e^{−β(2E_d + U − 2E_F)}]

  = [1 + e^{−β(E_d + U − E_F)}] / [(1/2)e^{β(E_d − E_F)} + 1 + (1/2)e^{−β(E_d + U − E_F)}]

  ≈ 2/[1 + e^{β(E_d − E_F)}] in the limit U → 0,

  ≈ 2/[2 + e^{β(E_d − E_F)}] in the limit U → ∞.

7 Here β = 1/k_bT.


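The occupation function just derived can be evaluated directly from the four states of Fig. 15.6; a numerical sketch (illustrative, energies in eV):

```python
import math

def avg_occupation(Ed, EF, U, kT=0.02585):
    """<n> for a donor level holding 0, 1 (two spin states), or 2 electrons,
    the doubly occupied state costing the on-site repulsion U."""
    b = 1.0 / kT
    states = [(0, 0.0), (1, Ed), (1, Ed), (2, 2 * Ed + U)]  # (n, E_n) pairs
    Z = sum(math.exp(-b * (E - n * EF)) for n, E in states)
    return sum(n * math.exp(-b * (E - n * EF)) for n, E in states) / Z

def fermi(E, EF, kT=0.02585):
    """Fermi-Dirac distribution."""
    return 1.0 / (1.0 + math.exp((E - EF) / kT))

# U -> 0 recovers <n> = 2 f(Ed); strong U with Ed = EF gives 2/3, not 1.
print(avg_occupation(Ed=0.1, EF=0.05, U=0.0), 2 * fermi(0.1, 0.05))
print(avg_occupation(Ed=0.0, EF=0.0, U=10.0))  # ~0.6667
```

The second print makes the many-particle effect explicit: with strong on-site repulsion and E_d = E_F, the average occupation is 2/3 rather than the non-interacting value of 1.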
If the electron-electron Coulomb interaction energy is negligible, by taking the limit U → 0, the average occupation function is ⟨n⟩ = 2f(E_d), where f(E) is the Fermi–Dirac distribution function. We expect this, because each eigenvalue can hold two electrons of opposite spin: the 2 here is just the spin degeneracy. In the opposite limit of strong electron-electron interaction, the occupation number of the state at thermal equilibrium is not a Fermi–Dirac distribution: this is easily seen by putting E_d = E_F, which yields an unconventional occupation function⁸ of 2/3. For the non-interacting distribution, this number is 1, exactly half of the 2 maximum electrons. These two distribution functions are shown in Fig. 15.6. Since the donor atom is ionized and contributes an electron to the conduction band if its occupation is zero (it has lost its electron!), we obtain immediately the many-particle result of the ionized donor density (and the ionized acceptor density based on identical arguments⁹):

N_D⁺ = N_D/(1 + 2e^{(E_F − E_d)/k_bT}) and N_A⁻ = N_A/(1 + 2e^{(E_a − E_F)/k_bT}),

which immediately leads to the master charge-neutrality equation

N_v F_{1/2}[(E_v − E_F)/k_bT] + N_D/(1 + 2e^{(E_F − E_d)/k_bT}) = N_c F_{1/2}[(E_F − E_c)/k_bT] + N_A/(1 + 2e^{(E_a − E_F)/k_bT}),

where the terms are p + N_D⁺ on the left and n + N_A⁻ on the right.


8 This is seen experimentally in what is known as the 2/3 conductance quantization peak in ballistic transport, for the same many-particle interaction reason as discussed here.

9 For acceptor doping, the prefactor of 2 in the exponent in the denominator depends on the degeneracy of the valence band edge. If the LH and HH bands are degenerate at the Γ-point, then this factor is 4, and the ionization is N_A⁻ = N_A/(1 + 4e^{(E_a − E_F)/k_bT}).
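The master charge-neutrality equation can be solved numerically by bisection on E_F, since the net charge imbalance increases monotonically with E_F. A sketch (illustrative: the Boltzmann approximation stands in for F_{1/2}, round-number silicon parameters N_c = 2.8×10^19 and N_v = 1.04×10^19/cm³ are used, and the 45 meV shallow dopant levels are assumed values; the doping matches the silicon example of Fig. 15.7):

```python
import math

kT = 0.02585                     # eV at 300 K
Nc, Nv = 2.8e19, 1.04e19         # /cm^3, common silicon values
Ec, Ev = 1.1, 0.0                # band edges in eV (Eg = 1.1 eV)
ND, NA = 1e14, 1e10              # /cm^3, doping of the Fig. 15.7 example
Ed, Ea = Ec - 0.045, Ev + 0.045  # assumed shallow donor/acceptor levels

def imbalance(EF):
    """(n + NA-) - (p + ND+): zero at the equilibrium Fermi level."""
    n = Nc * math.exp((EF - Ec) / kT)          # Boltzmann approximation
    p = Nv * math.exp((Ev - EF) / kT)
    NDp = ND / (1.0 + 2.0 * math.exp((EF - Ed) / kT))  # ionized donors
    NAm = NA / (1.0 + 2.0 * math.exp((Ea - EF) / kT))  # ionized acceptors
    return (n + NAm) - (p + NDp)

lo, hi = Ev, Ec
for _ in range(100):             # bisection; imbalance is monotonic in EF
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if imbalance(mid) > 0.0 else (mid, hi)
EF = 0.5 * (lo + hi)
print(f"EF - Ev = {EF:.3f} eV")  # ~0.78 eV, near the ~0.8 eV read off Fig. 15.7
```

At the solution the electron density equals N_D ≈ 10^14/cm³, confirming that the donors are fully ionized at this doping level.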




Fig. 15.7 Carrier statistics in silicon.


Fig. 15.8 Carrier statistics in doped GaN: n-type and p-type. Note that because the acceptor ionization energy is larger than the donor ionization energy, the p-type hole density is lower than the n-type electron density for the same effective doping.

The solution of this equation yields the Fermi level E_F at thermal equilibrium. Once the location of E_F is known, the mobile carrier densities n, p and the ionized donor and acceptor densities N_D⁺ and N_A⁻ are all known, and the state of the semiconductor is uniquely determined. Fig. 15.7 shows a graphical solution to the charge neutrality equation for silicon, with doping densities N_D = 10^14/cm³ and N_A = 10^10/cm³. The y-axis is the density, and the x-axis is the location of the Fermi level E_F with respect to the valence band edge E_v. Each term of the equation is indicated as a dashed line in the plot, and the solid lines are the left-hand side (LHS) and the right-hand side (RHS). They intersect slightly above E_F ≈ 0.8 eV, which is the location of the Fermi level where LHS = RHS, and the charges are completely balanced. The corresponding electron density can be read off as n ≈ N_D = 10^14/cm³, and the hole density is p ≈ n_i²/N_D, below 10^5/cm³. The point of intersection is along the line n = N_c F_{1/2}[(E_F − E_c)/k_bT], in the part that is still exponential, i.e. in the Maxwell–Boltzmann approximation n ≈ N_c exp[−(E_c − E_F)/k_bT]. If the doping N_D were much higher, the point of intersection would be closer to the conduction band edge E_F ≈ E_c, where the n-line starts curving slightly, which is the degenerate doping limit. If the acceptor density were higher than the donor density, the intersection point would have been on the left, on the line of p instead, making the semiconductor p-type.

Fig. 15.8 shows the charge neutrality plot for GaN, which has a much larger bandgap than silicon. Because the acceptor ionization energy of GaN doped with Mg (E_a ∼ 180 meV) is deeper than the donor ionization energy (E_d ∼ 20 meV for GaN doped with Si), the hole density for the same acceptor doping is lower. Fig. 15.9 shows how the Fermi level moves as a function of temperature, for various donor and acceptor doping densities for silicon and GaN. At high temperatures, the Fermi level moves towards the middle of the gap, because the thermally generated interband carrier density increases. A wider bandgap semiconductor is more resistant to this movement of the Fermi level. If the doping densities are low, N_D [...]

For ψ >> 1, e^{−ψ} ≈ 0, which is the case near the metal/semiconductor



junction and most of the depletion region except near x ≈ x_d. Then the solution is a quadratic function:

d²ψ/dx² ≈ 1/L_D² ⟹ ψ(x) = a + b(x/L_D) + c(x/L_D)²,


with the constants a, b, c determined by the boundary conditions. In most of the depletion region, ψ >> 1 and the band E_c(x) bends in a parabolic fashion; near the edge of the depletion region at x ≈ x_d, the band-bending ψ << 1. [...] For N_A >> N_D, the depletion region thickness is W ≈ √(2ε_s V_bi/(qN_D)). We have ended up with the same depletion thickness as the Schottky diode (Equation 15.19)! This should not be surprising, because a heavily doped semiconductor is in many aspects similar to a metal: it can provide a lot of bound charge within a very short depletion thickness, and therefore does not let the electric field penetrate too deep. On the other hand, a lightly doped semiconductor must deplete great distances to terminate, or provide, electric field lines. This asymmetry of the depletion thicknesses and fields in a pn junction is a very useful feature that is exploited rather heavily in devices: making a junction one-sided is responsible for unidirectionality and gain in electronics


and photonics, as discussed in later chapters. Finally, we note that if the doping is high on both sides of the junction, the depletion thickness is very narrow: this feature is exploited to make interband tunneling diodes, as will be discussed in Chapter 24. The energy band diagram is now obtained by integrating the electric field profile. From the discussion on Schottky diodes, it is evident that near the junction, the band bending is large, ψ(x) >> 1 in the n-side (or the p-side). This is clear since the band bending is of the order of the bandgap, which is much larger than k_bT. Thus, the band bending is parabolic near the junction, with the curvature of the band edges proportional to the ionized dopant densities on each side. Since the depletion region in the p-side has negative charge N_A⁻, the curvature of the bands points downwards there, and in the n-side the positive N_D⁺ creates an upward band curvature. This is a general feature of energy band diagrams: note that the positively charged donors of the metal/n-semiconductor Schottky diode in Fig. 15.11 also create a positive curvature. From the energy band diagram of the pn homojunction in Fig. 15.12, we note that electrons in the conduction band on the n-side must surmount a potential energy barrier of height qV_bi to move to the p-side. Since holes bubble up the bands, the holes in the valence band on the p-side face an exactly equal potential barrier to move to the n-side. In later chapters, we will see how applying a voltage will either lower or increase the barrier heights, injecting electrons through the depletion region into the p-side, where they become minority carriers because their number is lower than that of holes in the p-side. Similarly, holes will be injected into the n-side, where they are minority carriers. For this reason, a pn junction is called a minority-carrier device.
If we play our cards right, we can make electrons and holes recombine to produce light: this is the working principle of semiconductor light-emitting diodes, and lasers, discussed in later chapters of this book.

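To get a feel for the depletion widths discussed above, the one-sided result W ≈ √(2ε_s V_bi/qN_D) is quickly evaluated; a sketch (illustrative, with assumed round numbers and ε_r = 11.7 for silicon):

```python
import math

eps0 = 8.8541878128e-12  # F/m
q = 1.602176634e-19      # C

def depletion_width_nm(Vbi, ND_per_cm3, eps_r=11.7):
    """One-sided depletion width W = sqrt(2 eps_s Vbi / (q ND)), in nm."""
    ND = ND_per_cm3 * 1e6  # convert /cm^3 -> /m^3
    return math.sqrt(2.0 * eps_r * eps0 * Vbi / (q * ND)) * 1e9

# Lighter doping depletes much deeper: W scales as 1/sqrt(ND).
print(depletion_width_nm(0.8, 1e16))  # ~320 nm
print(depletion_width_nm(0.8, 1e18))  # ~32 nm
```

The 1/√N_D scaling is the quantitative content of the asymmetry noted above: the lightly doped side of a one-sided junction absorbs almost all of the depletion region.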
15.5 Heterojunctions

In the homojunction, the potential barrier for electrons is the same as for holes. For several device applications, a very powerful tool is to introduce unequal barrier heights for electrons and holes: this is achieved by using heterostructures. In Chapter 14 we discussed that semiconductors of unequal bandgaps can be realized in various quantized structures. The top three panels of Fig. 15.13 show the three possible band alignments at the heterojunction formed between two undoped semiconductors. These band alignments are possible by choosing appropriate semiconductors and growing them epitaxially. The conduction band offsets ΔE_c and the valence band offsets ΔE_v are locked by the choice of the semiconductors. Fig. 15.14 shows a range of bandgaps and band alignments available with 3D semiconductors, and Fig. 15.15 shows the 2D semiconductor counterpart.

Fig. 15.13 Various types of band alignments (top three figures). Bottom two figures: an example of electrostatics and band-bending at an isotype n-n heterojunction.

For 3D semiconductors, epitaxial growth with low defect densities requires lattice-matching, or strained layers up to certain critical thicknesses can be realized as long as the semiconductors are almost lattice-matched. For 2D materials that have weak van der Waals interlayer bonding, this stringent condition of lattice matching is partially relaxed. Among the alignments shown in Fig. 15.13, the Type I, or straddling band alignment is most common: it is most heavily used to form quantum wells in high-speed transistors, and LEDs and lasers in the GaAs, InP, and GaN material families. In these structures, both electrons and holes prefer to remain in the lowest bandgap region to lower their energy. The Type II, or staggered alignment is typically seen in Si/Ge heterostructures. This alignment separates electrons and holes: for example in the figure, holes would prefer to be in semiconductor A on the left, and electrons in semiconductor B on the right. To an extent, this structure mimics a p-n junction energy band diagram even when it is undoped. The Type III, or broken-gap alignment is seen for some combinations of semiconductors such as InAs/GaSb. In such a heterojunction, the valence band electron wavefunctions on the left have an evanescent tail in the conduction band states on the right, and vice versa. A charge dipole is naturally formed as a result in such junctions. The bottom two panels show how the energy bands of a heterojunction evolve with doping. The example chosen is an n-n heterojunction with Type I band offset, with the respective Fermi levels fixed by doping on each side. Following the same strategy as the Schottky and pn junction diodes, we conclude that semiconductor B on the right will lose electrons to semiconductor A on the left. Semiconductor B will thus form a depletion region, and become positively charged due to the ionized donors left behind, and the energy band there will curve upwards as shown.
However, semiconductor A was already n-type, and now has more electrons than its doping provided. This sort of situation is called accumulation of carriers. Semiconductor A near the junction is negatively charged not by bound ionized acceptors (because there are none), but by the mobile electrons dumped into it from semiconductor B. Because of the negative charge, no matter what its origin, the band must curve downward as shown. It is also possible that the conduction band edge of semiconductor A goes below the Fermi level. In that case, semiconductor A can have 3D electrons far from the junction, but bound 2D states near the heterojunction. If semiconductor A was initially undoped, then the charge transfer will create a 2D electron gas at the interface. This is called modulation doping, in which carriers are "imported" from another region. We will discuss such doping and its uses in future chapters. It is worth pointing out that since the exact nomenclature of naming band offsets as Types I, II or III (or straddling, staggered or broken) may sometimes vary in the literature based on the energy alignments

Fig. 15.14 Bandgaps vs. lattice constants, and band alignments of 3D semiconductors.

or sometimes the band symmetries, it must be clearly defined every time to avoid confusion.

15.6 Energy band diagrams: Poisson+Schrödinger

In this section, we show an example of the evaluation of the energy band diagram of a realistic semiconductor heterostructure that serves to illustrate all the concepts of this chapter. We will do this as a solved exercise. Consider the AlGaAs/GaAs quantum well semiconductor heterostructure shown in Fig. 15.16. Structures like this are used to make modulation-doped field effect transistors (MODFETs), also called high electron-mobility transistors (HEMTs), that are used in cell phones and electronic communication systems. Except for the gray region where the donor doping density is shown, all other regions are undoped, and the layer thicknesses are defined. The substrate is thick. The surface Schottky-barrier height is qφ_b = 0.6 eV. All following questions are to be answered for a z-axis running vertically from the metal on the top into the substrate layer at the bottom. For semiconductor properties, energy bandgaps, and energy band offsets, a good source is the website http://www.ioffe.ru/SVA/NSM/Semicond/.

a) Using Gauss's law, calculate by hand the net mobile electron sheet density in the conduction band in the GaAs quantum well n_s in 10^12/cm² units, the electric field in 10^6 V/cm units, and the energy band diagrams in eV units by neglecting quantum effects completely. Sketch them as a function of the depth from the surface z in Å units.

We shall now compare this hand-calculation with a self-consistent Poisson–Schrödinger solution. For this problem, use a self-consistent Poisson–Schrödinger solver to calculate, plot, and answer the following:

b) The energy band diagram. Plot the conduction and valence band edges E_c, E_v and the Fermi level E_F in eV units as a function of the depth from the surface in Å units.

c) The mobile electron density n(z) in the conduction band in 10^18/cm³ units as a function of the depth from the surface in Å units.

d) The electric field F(z) in 10^6 V/cm units as a function of the depth from the surface in Å units.

e) Plot the effective mass (or envelope) wavefunctions of the first five subbands in the GaAs quantum well as a function of the depth from the surface in Å units.

f) Explain the correlation between the electron density, the electric field, and the occupied energy subbands.

g) Compare the exact solution with the hand-calculation of part (a). Which quantities from the hand-calculation are accurate, and which ones are not trustworthy?

Fig. 15.15 Bandgaps vs. lattice constants, and band alignments of 2D semiconductors compared to assorted 3D semiconductors.

Fig. 15.16 An AlGaAs/GaAs/AlGaAs quantum well heterostructure with a Schottky gate metal for Section 15.6.

14 The basic method of Gauss's law in 1D problems states that the electric field due to an infinitely large area sheet charge of charge density σ is E = σ/2ε_s at all distances from it. For electric fields outside a sheet, one can use the integrated sheet charge of the volume. A corollary is that the electric field changes by a magnitude E_2 − E_1 = σ/ε_s across a sheet charge.

15 A self-consistent 1D Poisson–Schrödinger solver is available from https://www3.nd.edu/∼gsnider/.

16 Electric fields always point from the positive to the negative charge.

Solution:¹⁴ a) The dopants are in the Al0.3Ga0.7As layer, which is a wider-bandgap semiconductor than GaAs. As a result, the lowest energy states for the electrons are not at the location of the dopants, but in the GaAs layer, and in the Schottky metal on the surface. All the dopant electrons end up either in the QW or the metal. We will use this for the hand calculation. The total sheet density of dopants is σ_d = N_D · t_d = 10^13/cm², where t_d = 5 nm is the thickness and N_D = 2×10^19/cm³ is the doping density, which is positive in sign when ionized. Assume the sheet density in the metal is σ_m and in the GaAs QW is σ_qw. Both are negative in sign, though σ_m and σ_qw are the magnitudes. Charge neutrality requires σ_m + σ_qw = σ_d, implying we need one more equation to find σ_qw. This is found from the condition that the total voltage drop across the structure from the metal into the bulk of the semiconductor is equal to the built-in voltage.
This voltage is the net area under the electric field that develops due to the transfer of charge between the metal and the semiconductor, given by the difference of work functions φ_s − φ_m. Though this problem can be solved more accurately by hand, since we are doing a numerical evaluation in the next parts¹⁵, let's do a lightning-fast estimation: assume all the dopants are scrunched up at the center of the doped layer, a distance t_1 = 4.5 nm from the interface with the Schottky metal. Further, assume that all the 2DEG electrons are scrunched up ∼t_3 ≈ 5 nm from the top Al0.3Ga0.7As/GaAs interface, which is t_2 ∼ 5.5 nm from the dopant sheet. The 2DEG sheet is thus t_2 + t_3 = 10.5 nm from the dopant sheet. The electric field above the dopant sheet, E_1, must point towards the metal, and below it, E_2, towards the QW¹⁶. Using a built-in voltage of φ_b − ΔE_c/q and Gauss's law, we


Fig. 15.17 Solution to Problem 15.6 produced using the 1D Poisson simulator. Left, top: energy band diagram for part (b). Left, middle: mobile carrier density n(z) for part (c). Left, bottom: electric field E(z) for part (d). Right: effective-mass wavefunctions of the lowest 5 quantized states in the GaAs quantum well. Note that the actual electron concentration n(z) of Left, middle is similar in shape to the ground-state wavefunction ψ1(z), which is a strong indication that most of the mobile electrons in the quantum well are in the ground state. This is easily confirmed by finding that only one eigenvalue falls below the Fermi level, and all others are above it.


σm + σqw = σd = ND·td   [Charge Neutrality]

−|E1|·t1 + |E2|·(t2 + t3) = φs − φm ≈ φb − ΔEc/q   [Total Potential Drop]

|E1| = σm/εs and |E2| = σqw/εs   [Gauss's Law].   (15.26)

Solving the charge neutrality and potential drop equations immediately yields the electric fields and charges, whose signs are obtained by inspection:

|E1| = (qND·td/εs)·(t2 + t3)/(t1 + t2 + t3) + (φb − ΔEc/q)/(t1 + t2 + t3) ⟹ E1 = −1.25 MV/cm
σm = εs|E1| = q·nm ⟹ nm = εs|E1|/q
nm = ND·td·(t2 + t3)/(t1 + t2 + t3) + εs(φb − ΔEc/q)/[q(t1 + t2 + t3)] ⟹ nm ≈ 8.9 × 10^12/cm^2
|E2| = (qND·td/εs)·t1/(t1 + t2 + t3) − (φb − ΔEc/q)/(t1 + t2 + t3) ⟹ E2 = +0.15 MV/cm
σqw = εs|E2| = q·nqw ⟹ nqw = εs|E2|/q
nqw = ND·td·t1/(t1 + t2 + t3) − εs(φb − ΔEc/q)/[q(t1 + t2 + t3)] ⟹ nqw ≈ 1.1 × 10^12/cm^2   (15.27)
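The lightning-fast estimate of Equation 15.27 is easy to check numerically. The sketch below uses the sheet-charge geometry (t1, t2, t3) from the text; the effective built-in drop V_eff = φb − ΔEc/q ≈ 0.4 V is an assumed representative value here, since the exact φb and ΔEc come from the problem statement.

```python
# Hand-calculation of Problem 15.6(a): partition the dopant sheet charge
# between the Schottky metal and the quantum well (Eq. 15.27).
q = 1.602e-19             # C
eps0 = 8.854e-14          # F/cm
eps_s = 13.0 * eps0       # AlGaAs/GaAs permittivity (approximate)

sigma_d = 1e13            # donor sheet density N_D * t_d, /cm^2
t1, t2, t3 = 4.5e-7, 5.5e-7, 5e-7   # cm (4.5, 5.5, 5 nm)
t_tot = t1 + t2 + t3
V_eff = 0.4               # V, ASSUMED value of phi_b - dEc/q

# charge partitioning between metal and quantum well
n_m  = sigma_d * (t2 + t3) / t_tot + eps_s * V_eff / (q * t_tot)
n_qw = sigma_d * t1 / t_tot        - eps_s * V_eff / (q * t_tot)

E1 = q * n_m  / eps_s     # field magnitude above the dopant sheet, V/cm
E2 = q * n_qw / eps_s     # field magnitude below the dopant sheet, V/cm

print(f"n_m  = {n_m:.2e} /cm^2")   # ~8.9e12
print(f"n_qw = {n_qw:.2e} /cm^2")  # ~1.1e12
print(f"|E1| = {E1/1e6:.2f} MV/cm, |E2| = {E2/1e6:.2f} MV/cm")
```

With these numbers the metal soaks up roughly 90% of the dopant electrons, which is why the 2DEG density is so sensitive to the barrier thickness.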

Compare the hand-calculated approximate fields and charges, shown as dashed lines, with the exact solutions in Fig. 15.17: in spite of the very rough approximations, we are very close to the exact answers! Note that we could do much better even by hand, as we will see in later problems.
b) - e): Refer to Fig. 15.17.
f) The occupied electron subbands are those whose energy eigenvalues are below the Fermi level. Since only one eigenvalue falls below the Fermi level, that state is occupied with the highest density of electrons, and the net electron density takes the shape of the wavefunction squared of the ground state, as seen in Fig. 15.17, Left, middle and Right.
g) The electron sheet densities and electric fields can be obtained reasonably accurately by hand. This is because they primarily depend on Gauss's law and only weakly on the Schrödinger equation. If we want to know exactly how deep the ground state sits in the quantum well, or the shape of n(z), the hand calculation will not be the most accurate in quantum structures, though even then a decent approximation can be obtained by approximating the shape of the quantum well. An infinite-well approximation is a good starting point for calculations by hand.
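The infinite-well starting point mentioned above gives the quantized levels in closed form, E_n = (nπħ)²/(2m*L²). A minimal sketch, assuming an illustrative well width L = 10 nm and the GaAs effective mass m* = 0.067 m0 (representative values, not from the problem statement):

```python
# Infinite-well estimate of quantized levels: E_n = (n*pi*hbar)^2 / (2 m* L^2)
import math

hbar = 1.0546e-34   # J*s
m0   = 9.109e-31    # kg
q    = 1.602e-19    # C (converts J -> eV)

m_eff = 0.067 * m0  # GaAs conduction-band effective mass
L = 10e-9           # assumed well width, m

def E_n(n):
    """Energy of the n-th infinite-well level, in eV."""
    return (n * math.pi * hbar)**2 / (2 * m_eff * L**2) / q

for n in (1, 2, 3):
    print(f"E_{n} = {1000 * E_n(n):.0f} meV")
# levels scale as n^2: E_2 = 4 E_1, E_3 = 9 E_1
```

The ground state comes out near 56 meV for these parameters, a useful order-of-magnitude check against the exact Poisson-Schrödinger eigenvalues.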

15.7 Polarization-induced doping in heterostructures
Semiconductors of the GaN, InN, and AlN family, and the ZnO and ZnMgO family, have the hexagonal close-packed (HCP) Bravais lattice. They each have four atoms in the basis (e.g. two Ga and two N for GaN), and are of the wurtzite crystal structure. Unlike the zinc-blende or diamond structure of most III-V and group IV semiconductors, the


wurtzite structure is called uniaxial, because it has a unique (0001) c-axis along which inversion symmetry is broken. To see this, imagine physically flipping a GaN unit cell upside down along the c-axis: the atomic and bonding arrangement has changed. It is analogous to a bar magnet: the north and south poles of a magnet are tied to the ends, and flipping it upside down changes the magnetic field direction. GaN has a Ga-polar and a N-polar direction: the (0001) direction is Ga-polar, and the (0001̄) direction is N-polar. This is not the case in GaAs or Si. Now because of the large electronegativity difference across the chemical bonds, the bonds are partially ionic, and the lack of a center of symmetry results in the formation of an internal electric field in the crystal, even without any dopants! This is called spontaneous polarization. The quantum mechanical explanation of this phenomenon uses the Berry phase discussed in Chapter 9. Here, we discuss briefly the consequences of the presence of polarization in semiconductors. Fig. 15.18 shows that if the microscopic charge dipoles pi formed in partially ionic sp3 tetrahedral bonds do not exactly cancel, every tetrahedron is left with a net charge dipole, resulting in a net macroscopic polarization P = ∑i pi/Ω, where Ω is the volume of the unit cell. In a GaN crystal, the tips and the tails of these dipoles lie in the c-planes. Though these charges balance and cancel in the bulk, this is impossible at the surface, where the entire polarization charge appears as a bound sheet charge of density σπ = P·n̂. On the top Ga-face of the crystal, the polarization sheet charge is −σπ, and an exactly equal and opposite charge density +σπ must appear at the other surface of the crystal, which is the N-face. Because of the appearance of these bound charges, an electric field Fπ = σπ/εs is set up in the bulk, which causes the bands to bend, as seen in the figure.
At heterojunctions, the charge density that appears at the interface is the difference of the polarizations across it: σπ = (P1 − P2)·n̂. As a result of these interface polarization charges, the bands can bend in a way that induces mobile 2D electron or hole gases. Since the valence band is a reservoir of electrons, the conduction band at the heterointerface can be filled with electrons pulled out of the valence band near the surface by the polarization field. Similarly, electrons from the valence band can be pushed out to create holes in the bulk. If these junctions are close to the surface, filled and empty surface states also participate in this form of polarization-induced doping. Since this form of doping is created by electrostatics rather than thermal ionization, the mobile carrier densities generated are nearly temperature independent.
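The interface bound charge σπ = (P1 − P2)·n̂ can be estimated with a few material constants. The sketch below uses representative spontaneous-polarization, piezoelectric, and elastic constants for GaN and AlN from the literature, linearly interpolated for the alloy; treat the numbers as assumptions, since the result depends on the chosen parameter set.

```python
# Estimate of the bound polarization sheet charge at a coherently strained
# Al(x)Ga(1-x)N / GaN junction: sigma_pi = (P_barrier - P_GaN) . n_hat
q = 1.602e-19   # C

def sigma_pi_over_q(x):
    """Net polarization difference at Al(x)Ga(1-x)N/GaN, in electrons/cm^2."""
    # spontaneous polarization, C/m^2 (linear interpolation GaN <-> AlN)
    P_sp_barrier = -0.029 * (1 - x) - 0.081 * x
    P_sp_GaN = -0.029
    # piezoelectric part of the strained barrier:
    # P_pz = 2 * strain * (e31 - e33*c13/c33), with interpolated constants
    a_GaN, a_AlN = 3.189, 3.112           # Angstrom
    a_b = a_GaN * (1 - x) + a_AlN * x     # relaxed barrier lattice constant
    strain = (a_GaN - a_b) / a_b          # barrier strained to GaN
    e31, e33 = -0.49 - 0.11 * x, 0.73 + 0.73 * x   # C/m^2
    c13, c33 = 103.0 + 5.0 * x, 405.0 - 32.0 * x   # GPa
    P_pz = 2 * strain * (e31 - e33 * c13 / c33)
    sigma = abs((P_sp_barrier + P_pz) - P_sp_GaN)  # C/m^2
    return sigma / q / 1e4                          # -> /cm^2

print(f"sigma_pi/q at x = 0.3: {sigma_pi_over_q(0.3):.2e} /cm^2")
```

For x = 0.3 this gives a bound charge in the low-10^13/cm^2 range, which is why AlGaN/GaN HEMTs host dense 2DEGs without any intentional doping.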

15.8 Chapter summary section
In this chapter, we learned:


Fig. 15.18 Energy band diagrams of polar heterostructures such as GaN/AlGaN heterojunctions. Because of the presence of internal electric charges due to spontaneous and piezoelectric polarization, electric fields and band bending appear even in the absence of donor or acceptor doping.


• The carrier statistics of an intrinsic semiconductor, and the mass action law np = ni^2 at thermal equilibrium.
• How to relate the bandstructure of a semiconductor to its carrier statistics: for example, how to handle multiple conduction band valleys in determining the mobile electron densities.
• How charge neutrality defines the location of the Fermi level in all semiconductors.
• How donor and acceptor doping converts a semiconductor into an n-type or p-type conductor, and how to properly handle the many-particle aspect of donor and acceptor statistics.
• How to draw energy band diagrams of metal-semiconductor Schottky junctions, pn junctions, and heterojunctions using the universal philosophy that the Fermi level is the same everywhere. The energy band diagram is drawn by tracking the flow of charge → find the electric field → draw the energy band diagram.
• A rich range of energy band alignments for heterostructures of 3D and 2D semiconductors.
• How electronic polarization can create unbalanced charges and fields without doping.

Further reading

Most books on semiconductor devices cover the topics of this chapter in sufficient detail, because of the central importance of the control of electrons and holes in device applications. The popular book Physics of Semiconductor Devices by Sze and Ng is a good reference, as are Device Electronics for Integrated Circuits by Muller, Kamins, and Chan and Fundamentals of Modern VLSI Devices by Taur and Ning. The old and timeless classics are Shockley's Electrons and Holes in Semiconductors, Spenke's Electronic Semiconductors, and Physics and Technology of Semiconductor Devices by Grove (the founder of Intel!) – but good luck getting these books, as they are chronically out of print!


Exercises
(15.1) The deep-acceptor problem and the 2014 Physics Nobel Prize

Fig. 15.19 The 2014 Physics Nobel Prize went to the researchers who solved the p-type doping problem in GaN. This work made quantum well blue LEDs and lasers, and LED lighting possible.

(a) Show that for a homogeneous semiconductor with electron mobility µn and hole mobility µp, the lowest conductivity that can be achieved at thermal equilibrium is σmin = 2q·ni·√(µn·µp), irrespective of the donor or acceptor doping.
Magnesium is a relatively deep acceptor in the wide-bandgap semiconductor GaN. The acceptor ionization energy is EA ∼ 160 meV. Consider a GaN sample (Eg = 3.4 eV, mc* ∼ 0.2m0, mv* ∼ 1.4m0) doped with NA = 10^18/cm^3 magnesium acceptors. In the process of doping this sample with magnesium, unintentional donors of density ND = 10^14/cm^3 with donor ionization energy ED = 10 meV also incorporate into the semiconductor.
(b) For T = 300 K, plot the log of n, p, NA^−, ND^+, n + NA^−, and p + ND^+ as a function of the Fermi level EF. Remember the Fermi level can be within the gap, or in the conduction or valence bands, so in your plot vary the values of EF from below Ev to above Ec. Indicate the donor and acceptor ionization energies and show in the plot where the real Fermi level at 300 K is. Explain.
(c) What are the densities and types of mobile carriers in the sample at 300 K? Is the sample n- or p-type? Find the conductivity of the sample at 300 K if the electron mobility is µn ∼ 1000 cm^2/V·s and the hole mobility is µp ∼ 10 cm^2/V·s.
(d) Do online research on the connection between the p-type doping problem of wide-bandgap semiconductors and the 2014 Physics Nobel Prize and write a short summary of what you find.
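The minimum-conductivity result of part (a) follows from minimizing σ(n) = q(nµn + (ni²/n)µp) subject to np = ni². A quick numerical check of the closed form, with illustrative ni and mobility values (not the GaN numbers of this exercise):

```python
# Check sigma_min = 2 q ni sqrt(mu_n mu_p) against a brute-force minimum
# of sigma(n) = q*(n*mu_n + (ni^2/n)*mu_p) over the electron density n.
import math

q = 1.602e-19
ni = 1e10                    # /cm^3, illustrative (roughly silicon at 300 K)
mu_n, mu_p = 1000.0, 100.0   # cm^2/V.s, illustrative

def sigma(n):
    # hole density fixed by mass action: p = ni^2 / n
    return q * (n * mu_n + (ni**2 / n) * mu_p)

sigma_min = 2 * q * ni * math.sqrt(mu_n * mu_p)   # closed form

# brute-force minimum over a logarithmic grid of n
grid = [10**(8 + 4 * k / 10000) for k in range(10001)]   # 1e8..1e12 /cm^3
sigma_brute = min(sigma(n) for n in grid)

print(f"sigma_min (formula)   = {sigma_min:.3e} S/cm")
print(f"sigma_min (numerical) = {sigma_brute:.3e} S/cm")
```

The minimum occurs at n = ni√(µp/µn), i.e., in a slightly p-type sample when holes are the slower carriers.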

(15.2) Doping a semiconductor above the Mott criterion
In this problem, we will find the critical doping density of a semiconductor above which its electrical conductivity starts behaving like that of a metal. The signature of metallic conduction is a mobile electron density that is nearly independent of temperature. Consider GaAs, a 3D semiconductor crystal with a bandgap of Eg = 1.4 eV, a dielectric constant of εs = 13ε0, and a conduction band-edge effective mass of mc* = 0.067me. We dope the semiconductor with donor atoms to a 3D density Nd and investigate the free electron density in the conduction band.
(a) Which dopant atom(s) from the periodic table will you choose to make GaAs n-type? On which site of the crystal should the atom(s) of your choice sit to do the job?
(b) Find the donor ionization energy based on the effective mass theory.
(c) Assume a doping density of roughly Nd1 = 10^17/cm^3. Argue why at room temperature T = 300 K, the mobile electron density in the conduction band may be approximated to be roughly equal to the doping density n ≈ Nd1. Under this assumption, find the Fermi level EF with respect to the conduction band edge Ec for the doping density Nd1.
(d) The mobile electron density is measured as a function of temperature from T ∼ 300 K to T ∼ 1 K. Qualitatively sketch how the electron density will change with the lowering of temperature. Also qualitatively sketch how the electrical conductivity of the doped GaAs will change in the same temperature range by including a rough dependence of the electron mobility.
(e) Find the effective Bohr radius aB* of a donor electron state using the effective mass theory. This is a characteristic length scale over which the wavefunction of the electron due to the donor spreads in real space, with the center at the donor site. You should get an answer roughly equal to aB* ≈ 10 nm.
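The hydrogenic effective-mass estimates behind parts (b) and (e) simply rescale the hydrogen atom by the effective mass and dielectric constant. A minimal sketch for GaAs:

```python
# Effective mass theory for a shallow donor: rescale the hydrogen atom.
#   E_D  = 13.6 eV * (m*/m0) / eps_r^2
#   a_B* = 0.0529 nm * eps_r / (m*/m0)
Ry_eV  = 13.606      # hydrogen Rydberg, eV
a_B_nm = 0.0529      # hydrogen Bohr radius, nm

eps_r = 13.0         # GaAs relative dielectric constant
m_rel = 0.067        # m*/m0 for the GaAs conduction band

E_D   = Ry_eV * m_rel / eps_r**2   # donor ionization energy, eV
a_eff = a_B_nm * eps_r / m_rel     # effective Bohr radius, nm

print(f"E_D  = {1000 * E_D:.1f} meV")   # a few meV: a shallow donor
print(f"a_B* = {a_eff:.1f} nm")         # ~10 nm, as the exercise anticipates
```

The ~10 nm radius spans tens of lattice constants, which is what justifies treating the donor electron with the effective mass and bulk dielectric constant in the first place.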

(15.4) Semiconductor heterojunctions and band offsets: know what you are talking about

(f) Now consider the Mott critical doping density Nd^cr = (1/aB*)^3. Explain why at this and higher doping densities, it is not necessary for the electron to be thermally ionized to the conduction band to become mobile in the crystal. How would the electron move in the crystal in this situation?
(g) The above critical doping density is called the Mott criterion, up to a constant of order unity. If the doping density in GaAs is Nd2 > Nd^cr, re-do your answer for part (d) by sketching qualitatively the dependence of the mobile electron density and the electrical conductivity as a function of temperature. Explain why this is a characteristic of a metal, and why heavy doping has effectively converted an insulator (or semiconductor) into a metal.
(h) Based on the above, explain why narrow-bandgap semiconductors are prone to become metallic at low levels of unintended donor impurities, meaning that to ensure they have semiconducting properties, they must be much purer than their wider-bandgap counterparts.
(15.3) The mystery of high p-type conductivity in AlN
Here is a mystery you can help solve. Aluminum nitride (AlN) is one of the widest-bandgap semiconductors we know, with an energy gap of nearly 6.2 eV. The bandstructure is shown in Fig. 13.12 of the text. It is notoriously difficult to dope AlN n-type, and most people today consider it impossible to dope p-type. If you can dope it p-type, it will be a major breakthrough for both photonic and electronic devices. In 2014 a graduate student in my group was investigating thin layers of AlN deposited on top of an insulating silicon crystal by Hall-effect measurements of the conductivity. He called my cellphone excitedly late one night saying that he had measured very high p-type hole concentration and hole mobility in the sample, and therefore very high p-type conductivity. Turns out he was correct, but alas, it was not exactly the breakthrough of AlN p-type doping we were dreaming about.
What might have happened that gave such a high p-type conductivity for these AlN on silicon samples?

Fig. 15.20 Various heterojunctions before reaching equilibrium.










Fig. 15.21 Herbert Kroemer, a pioneer in semiconductor heterojunction-based design of transistors and lasers, was awarded the 2000 Nobel Prize in Physics.

Here is Kroemer's Lemma of Proven Ignorance (see Fig. 15.21): If, in discussing a semiconductor problem, you cannot draw an energy band diagram, this shows that you don't know what you are talking about, with the corollary: If you can draw one, but don't, then your audience won't know what you are talking about. In this problem we make sure we don't fall into this trap! In Fig. 15.20 (a-f), energy band diagrams of several different semiconductor heterojunctions are shown (one material on the left side and the other on the right side) before the two materials are put together, with their individual Fermi levels indicated. In each case sketch the equilibrium energy band diagram when a heterojunction is formed between the two semiconductors. In each case indicate the depletion and/or accumulation and/or inversion regions that may exist in equilibrium on either side of the heterointerface. The alignment shown corresponds to the electron affinity rule, as shown explicitly in part (a). All the labels are also shown in more detail in part (a), which you can use for the other parts.
(15.5) High electron mobility transistors (HEMTs)
Consider the AlGaAs/GaAs HEMT structure in Fig. 15.22. The structure is grown by MBE. The thicknesses and doping of the layers are: cap layer tcap = 5 nm with ND = 7 × 10^17/cm^3, Al0.3Ga0.7As layer thickness t1 = 25 nm, AlGaAs thickness after the gate-recess etch t2 = 17 nm, δ-doped layer of thickness 1 nm and effective 3D doping 3.5 × 10^19/cm^3, t3 = 5 nm, and GaAs quantum well thickness t4 = 10 nm. Assume the surface barrier height is pinned at qΦs = 0.6 eV below the conduction band edge for both GaAs and AlGaAs.

Fig. 15.22 AlGaAs/GaAs δ-doped HEMT.

(a) Calculate the 2DEG sheet density in the GaAs QW below the gate. Draw the charge-field-band diagram along line B-B' for finding the sheet density. Verify your calculated value with a 1D Poisson-Schrödinger simulation of the charge-field-band diagram. Is there any quantum confinement? How many quantum-confined states are formed in the GaAs QW? What are the eigenvalues?
(b) Calculate the 2DEG sheet density in the GaAs QW below the source- and drain-access regions. Draw the charge-field-band diagram along line A-A' for finding the sheet density. Verify your calculated value with a 1D Poisson simulation of the charge-field-band diagram. Comment on quantum confinement and eigenvalues.
(c) What is the gate capacitance Cg? Verify your analytical result with a 1D Poisson-Schrödinger simulation of Cg vs. the applied gate voltage Vg.
(d) Calculate the threshold voltage Vth for the HEMT. Verify your analytical result with a 1D Poisson-Schrödinger simulation of the sheet density ns vs. the applied gate voltage Vg; plot this in both linear and log scales.

336 Exercises agram of this structure at zero bias and at pinchoff. What is the effective Schottky-barrier height in these two cases? Do you expect the gate leakage of this diode to be different from the AlGaN/GaN structure? Why (not)?

Fig. 15.23 Band alignments in a heterostructure bipolar transistor.

(15.7) Polar III-nitride semiconductor heterostructures Consider an Al0.3 Ga0.7 N/GaN HEMT structure. Assume that the surface Schottky barrier height is qφs = 1.7 eV on AlGaN and qφs = 0.9 eV on GaN. The Al0.3 Ga0.7 N layer is coherently strained to the GaN lattice constant. (a) Calculate the net polarization sheet charge Qπ ( x ) at a strained Alx Ga1− x N/GaN heterojunction for barrier Al composition x. Use the value for x = 0.3 for the rest of the problem. (b) How does the mobile sheet charge at the AlGaN/GaN junction vary with the thickness tb of the AlGaN barrier? Plot the sheet charge ns for AlGaN thicknesses up to tb = 40 nm. (c) Plot the energy band diagram of an AlGaN/GaN HEMT with a tb = 30 nm AlGaN cap at zero gate bias, and at pinch-off. What is the pinchoff voltage? Verify your analytical calculation with a self-consistent 1D Poisson–Schrodinger solution. ¨ (d) Now, a tcap = 5 nm layer of GaN is added above the AlGaN barrier. Calculate and plot the band di-

(15.8) Thermodynamics: the great leveler
In semiconductor device physics, the central tenet is that the Fermi level is the same everywhere in thermal equilibrium. We proved it in statement 15.17 of this chapter. Think deeper about what this means, because it is tied to the concept of thermal equilibrium itself.
(a) The statement does not require that the flow of electrons across the junction be zero in each direction, just that the flows be equal in both directions such that the net flow is zero. This implies that, based on the DOS of each side of the junction, there can be currents flowing in both directions. Sketch these currents for a few junctions and examine their dependence on the Fermi level, temperatures, and band alignments.
(b) A more interesting question is: if the junction was driven out of equilibrium by an applied voltage or optical illumination, and the perturbation is suddenly turned off, how long does it take to reach thermal equilibrium? Argue why the currents you have sketched above are responsible for returning the system to thermal equilibrium, and why their microscopic nature will determine the duration of the return.
The discussion above highlights the following important fact. The final equilibrium state itself is independent of the nature of the microscopic details that take the system to that state. But how the system gets into, and out of, the equilibrium state is very much dependent on the microscopic details!

Controlling Electron Traffic in the k-Space
This chapter marks a summary of Modules I and II, and the beginning of a new adventure: Modules III and IV, on quantum electronics and quantum photonics with semiconductors. The goal of this chapter is to serve as a roadmap by making a smooth transition from Modules I and II. This is achieved by first succinctly summarizing the physics of semiconductors developed in the first two modules. Then, the quantum electronics and photonics of Modules III and IV are summarized in the same spirit. There are significant advantages to sampling the flavor of all the core concepts of the subject in one place. You will solidify the concepts you have learned in Modules I and II, and should come back and re-read this chapter after going through Modules III and IV to see the big picture.

• The first part of this chapter succinctly discusses the quantum physics of electron and hole statistics in the bands of semiconductors, the quantum mechanical transport of the electron and hole states in the bands, and optical transitions between bands. The unique point of view presented here is a unified picture exemplified by single expressions for the carrier statistics, transport, and optical transitions for electrons and holes in nanostructures of all dimensions, ranging from bulk 3D, to 2D quantum wells, to 1D quantum wires, and with a direct connection to electronic and photonic devices.
• The second part of the chapter summarizes the core concepts of the physics, and the electronic and photonic applications, of semiconductors and nanostructure-based devices such as diodes, transistors, solar cells, light emitting diodes, and lasers.
• Drawing an analogy, Modules I and II tell us how to create the architecture of a city, but for electrons: with houses and parking lots carved out of bands and heterostructures, roads of electron bandstructure E(k), and speed limits of v_g(k) = (1/ħ)∇_k E(k) for each k-lane. In Modules III and IV we learn the tools of traffic control: gates and one-way lanes, electron dynamics in response to voltages and light, and how to coax out useful semiconductor devices by combining these conceptual building blocks.

16.1 Electron energies in semiconductors
16.2 Semiconductor statistics
16.3 Ballistic transport in semiconductors
16.4 Ballistic transport in non-uniform potentials/tunneling
16.5 Scattering of electrons by phonons, defects and photons
16.6 The Boltzmann transport equation
16.7 Current flow with scattering: drift and diffusion
16.8 Explicit calculations of scattering rates and mobility
16.9 Semiconductor electron energies for photonics
16.10 The optical joint density of states ρJ(ν)
16.11 Occupation of electron states for photonics
16.12 Absorption, and emission: spontaneous and stimulated
16.13 Chapter summary section
Further reading





16.1 Electron energies in semiconductors

1 This is expected since if the atoms were very far apart before forming a crystal, the number of states at each allowed energy is simply equal to the number of atoms.

2 A crude classical analogy: a ball moving at a constant velocity in a flat plane will maintain its momentum, but if it rolls in a periodic potential, its velocity increases and decreases periodically; its momentum is not fixed.

As we have discussed in Chapters 1-15, electrons in free space are allowed to have continuous values of energies E(k) = ħ²k²/2me, where ħ = h/2π is the reduced Planck's constant, me the rest mass of an electron, and k = 2π/λ the wavevector with λ the electron wavelength. ħk = h/λ = p is the momentum of the free electron by the de Broglie relation of wave-particle duality. The dynamical properties of the electron are governed neither by Newton's law for particles, nor by Maxwell's equations for waves, but by the Schrödinger equation, which has the wave-particle duality baked in from the get-go. A periodic potential V(x + a) = V(x) in the real space of a crystal of lattice constant a, when included in the Schrödinger equation, is found to split the continuous spectrum of electron energies E(k) = ħ²k²/2me into bands of energies Em(k), separated by energy gaps, shown in Fig. 16.1. The mth allowed energy band is labeled Em(k). The states of definite energy Em(k) have real-space wavefunctions ψk(x) = e^{ikx}·uk(x) called Bloch functions, where uk(x + a) = uk(x) is periodic in the lattice constant. Each band has exactly the same number of allowed electron states N as the number of unit cells in the real space¹. Each state of energy Em(kn) in each band m is indexed by a unique kn = 2πn/L with n = 0, ±1, ±2, ..., where L = Na is the macroscopic size of the crystal. The kn lie in the first Brillouin zone, between −G/2 ≤ k ≤ +G/2, where G = 2π/a is the reciprocal lattice vector. The Pauli exclusion principle allows each state to be occupied by gs = 2 electrons of opposite spins. Thus, each band Em(k) can hold a maximum of 2N electrons. The energy bands have the property Em(k + G) = Em(k), i.e., they are periodic in the k-space. The k of the band Em(k) has a different meaning from the wavevector of a free electron: ħk is not the momentum of the electron of a unique energy Em(k).
States of definite energy (or energy eigenstates) Em(k) are not states of definite momentum in a periodic crystal². The Bloch states are a mixture of states of momenta ħ(k ± G), which is why ħk is referred to as the "crystal momentum". The group velocity of state k in energy band E(k) is v_g = (1/ħ)∇_k E(k). In response to an external force F, the crystal momentum changes according to F = ħ dk/dt; the energies and velocities change to satisfy this relation. The two statements above are the most remarkable results of the quantum mechanics of electrons in crystals, and are by no means obvious. The states at the very bottom and the very top of any band Em(k) must then have zero group velocity. An empty band cannot carry a charge current because there are no electrons in it. A filled band also cannot carry a net charge current, but for a different reason. In a filled band, the current carried by a filled state k has an exactly equal partner, but opposite in sign, at −k. The net current cancels and gives zero: J_filled = (−q)/L^d ∑_k f(k)·v_g(k) = (−q)/L^d ∑_k v_g(k) = 0 for a filled band,


Fig. 16.1 The allowed electron energies in a crystal constitute the bandstructure E(k). The DOS depends on the dimensionality: the 1D, 2D, and 3D DOS are shown. The Fermi level EF is controlled by doping, electrostatically, or with light or heat. The band edges may also be changed in real space by doping, electrostatics, light, or heat, leading to energy band diagrams.

where f(k) = 1 is the occupation probability of state k, −q is the electron charge, and L^d is the volume in d dimensions. An empty band can carry current only if electrons are put into it: by doping, electrostatically, optically, or thermally. A filled band can carry a net current only if electrons are removed from it, again by doping, electrostatically, optically, or thermally. When electrons are removed from the top of a filled band, the net current is

J = (−q)/L^d ∑_k f(k)·v_g(k) = (−q)/L^d ∑_k 1·v_g(k) + (−q)/L^d ∑_k [f(k) − 1]·v_g(k) = (+q)/L^d ∑_k [1 − f(k)]·v_g(k),   (16.1)

implying the current is effectively carried by "holes", or empty electron states, which behave in transport and electrostatics as positive mobile charges +q, where −q is the normal electron charge. Fig. 16.1 shows mobile electrons at the bottom of the conduction band, and mobile holes as empty states at the top of the valence band. A crystal is an intrinsic semiconductor if there is a few-eV bandgap between a completely filled valence band Ev(k) and the next conduction band Ec(k) that is completely empty. At the very bottom of the conduction band, we can expand the energies as the parabola Ec(k) ≈ Ec + ħ²k²/2mc*, where Ec(0) = Ec is the band edge, or lowest energy, and mc* is the conduction band edge effective mass. Similarly, the top of the valence band is approximated as Ev(k) ≈ Ev − ħ²k²/2mv*, with Ev(0) = Ev the valence band edge, and mv* the valence band effective mass. Ec − Ev = Eg is the energy bandgap of the semiconductor. If the crystal momentum k at which the conduction band reaches its minimum and the valence band reaches its maximum are exactly the same, we have a direct bandgap semiconductor, and if they are not, it is an indirect bandgap semiconductor.
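Two of the statements above, that v_g vanishes at the band extrema and that a completely filled band carries zero net current, are easy to verify numerically. The sketch below uses a 1D tight-binding band E(k) = −2t·cos(ka) as a stand-in bandstructure; the model and its parameters are illustrative, not from the text.

```python
# Verify: (1) v_g(k) = (1/hbar) dE/dk vanishes at the band bottom and top;
#         (2) a filled band carries zero net current: sum_k v_g(k) = 0.
import math

hbar = 1.0546e-34     # J*s
t = 1.0 * 1.602e-19   # hopping energy, J (1 eV, illustrative)
a = 5e-10             # lattice constant, m
N = 200               # number of unit cells -> N allowed k-points

def v_g(k):
    # E(k) = -2 t cos(k a)  =>  v_g = (1/hbar) dE/dk = (2 t a / hbar) sin(k a)
    return 2 * t * a * math.sin(k * a) / hbar

# allowed k_n = 2 pi n / (N a), folded into the first Brillouin zone
ks = [2 * math.pi * n / (N * a) for n in range(-N // 2, N // 2)]

assert abs(v_g(0)) < 1e-12             # zero group velocity at the band bottom
assert abs(v_g(math.pi / a)) < 1e-3    # ...and at the band top (zone edge)

# filled band: f(k) = 1 for every state; currents at +k and -k cancel
J_filled = sum(v_g(k) for k in ks)
print(f"net filled-band velocity sum = {J_filled:.3e} m/s")
```

Since sin is odd, each state k has an exact partner at −k with opposite velocity, so the sum collapses to zero, which is the filled-band argument of the text in numerical form.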

16.2 Semiconductor statistics

We will build the map of semiconductor physics shown in Fig. A.1 in the Appendix. Some general results for semiconductor structures are now discussed, in which electrons and holes are allowed to move in d dimensions. d = 3 is the case in an ordinary 3D bulk semiconductor, d = 2 in 2D semiconductor membranes or heterostructure quantum wells, d = 1 in quantum wires, and d = 0 in quantum dots. The density of states near the edge of the parabolic conduction band in d dimensions is given by

g_c^d(E) = [gs·gv / ((4π)^(d/2)·Γ(d/2))] · (2mc*/ħ²)^(d/2) · (E − Ec)^((d−2)/2),   (16.2)

where gs = 2 is the spin degeneracy for conventional situations, and gv is the valley degeneracy. The units are 1/(eV·cm^d). The valley degeneracy of the conduction band is gv = 1 for most direct-bandgap semiconductors in 3D, but is gv = 6 for 3D silicon, gv = 4 for 3D germanium, and gv = 2 in single-layer 2D graphene, BN, or transition metal dichalcogenide semiconductors. Γ(...) is the gamma function, with Γ(1) = 1, Γ(1/2) = √π, and Γ(n + 1) = nΓ(n). For d = 3 the DOS increases with energy E as √(E − Ec), for d = 2 the DOS is constant, and for d = 1 the DOS decreases as 1/√(E − Ec). The dependence on the conduction band edge effective mass is (mc*)^(d/2), which means a conduction band with a heavier effective mass has a higher DOS in all dimensions. Exactly in the same way, the valence band DOS has mv* for the mass, and the energy dependence on dimensions is (Ev − E)^((d−2)/2), the same as for the conduction band, except the argument is Ev − E for obvious reasons at the top of the valence band. Fig. 16.1 shows the DOS of the entire bands; the approximations of Equation 16.2 are applicable near the band edges, where most carrier transport occurs. At equilibrium, the Fermi level EF determines the number of electrons nd in the conduction band and the number of holes pd in the valence band. Conversely, at equilibrium, if we know either the number of electrons nd in the conduction band or holes pd in the valence band, the Fermi level EF is uniquely determined. There is a one-to-one correspondence between the densities and EF at equilibrium. The relation is obtained by using the Fermi-Dirac occupation function with the density of states:

n_d = ∫_{Ec}^{∞} dE · g_c^d(E) · 1/(1 + e^{(E − EF)/kbT}) = N_c^d · F_{(d−2)/2}((EF − Ec)/kbT).
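The single d-dimensional expression of Equation 16.2 should reduce to the familiar per-dimension results; the sketch below checks d = 2 and d = 3 numerically against the standard forms m*/(πħ²) and (1/2π²)(2m*/ħ²)^(3/2)√(E − Ec).

```python
# Numerical check that Eq. 16.2 reproduces the standard 2D and 3D DOS.
import math

hbar = 1.0546e-34   # J*s
m0   = 9.109e-31    # kg
q    = 1.602e-19    # C

def dos_d(E, Ec, m_eff, d, gs=2, gv=1):
    """g_c^d(E) of Eq. 16.2, per joule per m^d, for E > Ec."""
    pref = gs * gv / ((4 * math.pi)**(d / 2) * math.gamma(d / 2))
    return pref * (2 * m_eff / hbar**2)**(d / 2) * (E - Ec)**((d - 2) / 2)

m_eff = 0.067 * m0        # GaAs-like effective mass, illustrative
E, Ec = 0.1 * q, 0.0      # evaluate 0.1 eV above the band edge

# d = 2: constant DOS m*/(pi hbar^2), independent of E (includes spin)
g2_expected = m_eff / (math.pi * hbar**2)
assert math.isclose(dos_d(E, Ec, m_eff, 2), g2_expected, rel_tol=1e-12)

# d = 3: (1/(2 pi^2)) (2m*/hbar^2)^(3/2) sqrt(E - Ec) (includes spin)
g3_expected = (1 / (2 * math.pi**2)) * (2 * m_eff / hbar**2)**1.5 * math.sqrt(E - Ec)
assert math.isclose(dos_d(E, Ec, m_eff, 3), g3_expected, rel_tol=1e-12)

print("Eq. 16.2 reproduces the standard 2D and 3D band-edge DOS")
```

The (4π)^(d/2)·Γ(d/2) prefactor is exactly what makes one formula collapse to the right constants in every dimension.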


The prefactor N_c^d = gs·gv·(2π·mc*·kbT/h²)^(d/2) is called the effective band-edge DOS, and has units of 1/cm^d, where d is the dimension. For typical semiconductors, N_c^1d ∼ 10^6/cm, N_c^2d ∼ 10^12/cm^2, and N_c^3d ∼ 10^18/cm^3 at T = 300 K. The dimensionless Fermi-Dirac integral of order j is F_j(η) = (1/Γ(j + 1)) ∫_0^∞ du·u^j/(1 + e^{u−η}), the values of which may be obtained from tables, by calling functions (in Matlab, Mathematica, Python, or other packages), or by direct integration. Two important limiting cases of the Fermi-Dirac integral are the (semi-classical) non-degenerate Maxwell-Boltzmann limit F_j(η) ≈ e^η for η < −1, and the degenerate limit F_j(η) ≈ η^{j+1}/Γ(j + 2) for η > +1. The dimensionless argument is η = (EF − Ec)/kbT, which is a measure of how far the Fermi level EF is from the band edge Ec, measured in units of the thermal energy kbT. If, for example, the Fermi level is at EF = Ec − 6kbT for a 3D semiconductor with N_c^3d = 10^18/cm^3, η = (EF − Ec)/kbT = −6, and the electron density in the conduction band is n_3d = N_c^3d·F_{1/2}(η) ≈ N_c^3d·e^{−6} ≈ 10^18/400 ≈ 2.5 × 10^15/cm^3. On the other hand, if for the same semiconductor EF = Ec + 3kbT, η = +3, and the electron density is n_3d = N_c^3d·F_{1/2}(η) ≈ N_c^3d·η^{3/2}/Γ(1/2 + 2) = 10^18·3^{3/2}/((3/4)√π) ∼ 4 × 10^18/cm^3. Similar arguments carry over to find the statistics and densities of holes in the valence band. In the absence of dopants and impurities, an intrinsic semiconductor is charge neutral. This implies that if any electrons are thermally or optically ionized from the valence to the conduction band³, they must leave behind an exactly equal number of holes in the valence band. This sets n_d = p_d, or N_c^d·F_{(d−2)/2}((EF − Ec)/kbT) = N_v^d·F_{(d−2)/2}((Ev − EF)/kbT), which is only possible for a unique EF. The charge neutrality condition fixes the location of the Fermi level EF. The product of the intrinsic carrier densities is

$$n_d \cdot p_d = N_c^d N_v^d\, F_{\frac{d-2}{2}}\left(\frac{E_F - E_c}{k_b T}\right) F_{\frac{d-2}{2}}\left(\frac{E_v - E_F}{k_b T}\right) \approx N_c^d \cdot N_v^d \cdot e^{-\frac{E_g}{k_b T}}, \qquad (16.4)$$

where the approximation on the right holds only for non-degenerate carrier distributions, when (EF − Ec)/kbT ≪ −1 and (Ev − EF)/kbT ≪ −1.

For electron energies E > Ec(x), as indicated by the state of energy E1 in Fig. 16.5, Q(x) = −k(x)² < 0, and the exponential $e^{\pm\int du\sqrt{Q(u)}} = e^{\pm i\int dx\,k(x)}$ in the WKB wavefunction is still oscillatory, like the plane-wave situation. The electron density corresponding to the envelope function (not the total wavefunction8) is then given by the probability density n(x) ≈ |C(x)|² = |K|²/|k(x)|, from which it may be seen that for regions where k(x) is large, n(x) is small. The analogy to classical mechanics is the following: a particle spends less time in a region where it is moving fast, i.e., where its kinetic energy is large. The approximate current density J ≈ qnv ≈ q(|K|²/k(x)) · (ħk(x)/m_c*) = qħ|K|²/m_c* ensures the current carried by the wavepacket is continuous in space. When an electron of energy E lower than the barrier height is incident on a barrier, as indicated by the state of energy E2 in Fig. 16.5, E < Ec(x) on the right side of the region, and Q(x) = κ(x)² > 0. In this case, the WKB envelope function is $C(x) \approx \frac{K}{\sqrt{\kappa(x)}} e^{\pm\int dx\,\kappa(x)}$, which

8 The periodic part of the Bloch function always remains as a constant background oscillation, which does not change the arguments. uk(x) is similar to the individual compartments of a long train, and C(x) is the train when looked at from afar – say from an airplane flying at a great height above the train, when the individual compartments are not seen. By looking at C(x), we are looking at the transport of the entire train, or the entire electron wavepacket.

now is an exponentially decaying, or growing, wavefunction amplitude, quite unlike the oscillatory function seen for E > Ec(x). For a thick potential barrier, the electron wave is reflected back after a small, yet finite, incursion into the classically forbidden region where E2 < Ec. If on the other hand the barrier is of a finite thickness between x1 < x < x2, then for propagation of the wavepacket from x1 to x2 the WKB transmission probability through the barrier is

$$T_{WKB} \approx \left|\frac{C(x_2)}{C(x_1)}\right|^2 \approx e^{-2\int_{x_1}^{x_2} dx\,\kappa(x)} = e^{-2\int_{x_1}^{x_2} dx\,\sqrt{\frac{2m_c^*}{\hbar^2}[E_c(x) - E]}}.$$




As an example, consider a spatially uniform potential barrier of height V0 − E above the electron energy, and take the total barrier thickness to be x2 − x1 = tb. Using them in Equation 24.10, we get

$$T_{WKB} \approx \exp\left[-\frac{t_b}{0.1\ \text{nm}}\sqrt{\frac{m_c^*}{m_e}\cdot\frac{V_0 - E}{1\ \text{eV}}}\right].$$

For V0 − E = 1 eV, m_c* = me and tb = 3 nm, the tunneling probability is 1/e^30 ≈ 10^−13. If the barrier thickness decreases by three times to tb = 1 nm, the tunneling probability increases substantially to 1/e^10 ≈ 5 × 10^−5. This trick is used in several semiconductor device applications, such as in ohmic contacts and in resonant tunneling diodes, to boost the tunneling currents. Chapter 24 goes deeper into the topics of transmission and tunneling, for which a flavor was provided here.
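The numbers above follow directly from the WKB expression; a minimal sketch that reproduces the ~10^−13 and ~5 × 10^−5 probabilities for the uniform barrier:

```python
import numpy as np

HBAR = 1.054571817e-34    # J*s
ME   = 9.1093837015e-31   # kg
EV   = 1.602176634e-19    # J

def t_wkb_uniform(barrier_ev, thickness_nm, m_eff=1.0):
    """WKB transmission through a spatially uniform barrier: height V0 - E (eV),
    thickness t_b (nm), effective mass in units of the free-electron mass."""
    kappa = np.sqrt(2.0 * m_eff * ME * barrier_ev * EV) / HBAR  # decay constant, 1/m
    return np.exp(-2.0 * kappa * thickness_nm * 1e-9)

t3 = t_wkb_uniform(1.0, 3.0)  # ~1e-13: the thick-barrier case in the text
t1 = t_wkb_uniform(1.0, 1.0)  # ~5e-5: three times thinner
```

The exponential sensitivity to thickness is the point: shrinking the barrier by 3x raises the transmission by roughly nine orders of magnitude.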

16.5 Scattering of electrons by phonons, defects and photons

The ballistic picture of transport assumes that the distribution functions f(k) of the electrons in the ideal Bloch bands En(k) are at the mercy of the contacts to the electron states in the bands. Because of this, the occupation function f(k) of the band electrons shares the same Fermi level as the contact(s) with which the band can freely exchange electrons. The hidden assumption in this picture is that the presence of non-idealities in the crystal, or perturbations arising in the surroundings of the crystal, do not mix the electrons between different k-states against the wishes of the contacts. This is a good approximation in the smallest devices. But it is possible for physical processes that are not considered in the Bloch bandstructure to wrest part of the control of the occupation function of the electron states from the contacts. These processes are schematically shown in Fig. 16.6. Each of them can be useful, or may create losses. Understanding their nature is critical to coax the desired function from semiconductor devices. In fact, each of the processes can be put to use in some form of useful semiconductor device as long as the mechanism is understood.

The processes divide crudely into two types. The first type causes large changes in the energy of the electrons, thereby enabling transitions from one band to another across the bandgap; these are called interband scattering processes. Radiative (photonic) transitions and non-radiative transitions belong to the interband category. The second type of scattering causes minor changes in the energy (inelastic scattering), or no change in the energy (elastic scattering) of the electrons. These leave the electrons within the same band and are called intraband scattering processes. Intraband scattering processes change the momentum of electrons, thereby mixing various k-states, and lead to electrical resistivity.
Phonon and defect scattering belong to the intraband scattering processes9. Using Fig. 16.6, we discuss these processes as an appetizer for the more detailed treatment in the following chapters. Photons [Module IV]: Consider a semiconductor crystal on which photons, or light, is incident. In Chapters 26 and 27 of Module IV

9 A defect that is deep in the bandgap can trap, or bind, an electron, removing it from the Bloch band energy spectrum and localizing it. We saw an example of this in Chapter 13, Section 13.4. If the deep level interacts with only one band, it captures and releases free carriers, causing fluctuations, or noise, in the band current. If a deep level can interact with both bands, it can cause interband recombination. In several following chapters, we will find that processes mediated by deep levels play critical roles in the operation of semiconductor devices.

348 Controlling Electron Traffic in the k-Space



Fig. 16.6 Schematic representation of various scattering processes that mix electron states within bands, and between bands of a semiconductor. On the left are shown interband processes that move electrons from one band to another: the absorption of a photon moves an electron from the valence band to the conduction band, and the emission of a photon causes the reverse process. Absorption therefore creates, or generates, an electron in the conduction band and a hole in the valence band, and is therefore labeled as a Generation or G-process creating an electron-hole pair. Similarly, emission of a photon can occur when an electron drops from the conduction band to the valence band, leading to the annihilation of an electron-hole pair, and is referred to as a Recombination or R-process. The recombination/generation (R-G) processes may be radiative, or mediated by deep-level states accompanied by emission of phonons, in which case they are called non-radiative, or commonly referred to as Shockley–Read–Hall (SRH) processes. As opposed to the interband R-G processes, scattering mechanisms called intraband processes are shown on the right. These processes cause small or no changes to the energy of electrons and leave them in the same band. These mechanisms include scattering of electrons by phonons, and scattering by defects and impurities. Note the typical time scales of the various scattering mechanisms: ranging from ∼ps for intraband processes to ∼ns for radiative processes, and ∼ms-seconds for non-radiative processes.

we review that photons in a light wave are the quanta of electromagnetic excitations of the vacuum. The electric field profile of a light wave is an oscillation in space and time (spacetime) of the form E(x,t) = 2E0 cos(kx − ωt) = E0[e^{i(kx−ωt)} + e^{−i(kx−ωt)}], where E0 is the amplitude of the electric field. The circular frequency ω and the photon wavevector k = 2π/λ are related by ω = ck, where c is the speed of light and λ the wavelength. The potential energy seen by the electrons in the crystal follows from the integral of the electric field profile, W(x,t) = q∫dx · E(x,t). If the light wavelength is of the order of µm, which is much larger than the lattice constant and electron wavelengths in the crystal, the light wave may be approximated as E(x,t) ≈ E0 e^{±iωt} (this is called the dipole approximation), and the perturbation seen by the electrons in the semiconductor is W(x,t) = q∫dx · E(x,t) ≈ qE0 x e^{±iωt} = W0(x)e^{±iωt}, where the perturbation is split into a spatial part W0(x) and a time-dependent part e^{±iωt}. This potential is experienced by the electrons in addition to the perfect periodic potential Vper(x) of the crystal, so the total potential seen by the electrons is Vper(x) + W(x,t). The periodic potential Vper(x) creates the band states, and the perturbation due to light W(x,t) causes


transitions between them, both upwards for absorption and downwards for emission, as indicated in Fig. 16.6. Module IV of the book is dedicated to the study of these photonic interband processes, which typically occur at ∼ns timescales, and the devices that result from them, such as light emitting diodes, solar cells, and lasers. Phonons [Chapter 22]: Even when light is not incident on a semiconductor, the atoms in the crystal do not sit still; they vibrate around their equilibrium periodic positions10. In Chapter 22 we will find that the amplitude of their vibrations in space increases with the lattice temperature T. At a high temperature, when the vibration amplitude approaches the interatomic distance, the crystal melts. At room temperature, the vibrations from the mean positions are not too large, and can be broken down into quanta of orthogonal lattice wave modes called phonons. The displacement of an atom at location x at time t due to a single phonon is written as u(x,t) = u0 e^{i(qx−ωt)}. This phonon has an amplitude u0, a wavevector q = 2π/λ, and frequency ω. In crystalline solids, ω and q have two branches of relations: acoustic and optical. For acoustic modes ω = vs q, where vs is the sound velocity, and for optical modes ω ≈ ω0. The perturbation to the periodic crystal potential Vper(x) due to the phonon is treated as a small time-dependent change due to the dilation or compression of the lattice. This is indicated as a greatly exaggerated change to the energy band diagram in Fig. 16.6, where regions of compressive strain have a larger bandgap and dilated regions have a smaller bandgap. A deformation potential Dc is used to measure the change in the electronic potential experienced by the electron: W(x,t) = Dc du(x,t)/dx for a longitudinal acoustic wave. For such a wave, W(x,t) = iq Dc u0 e^{i(qx−ωt)} = W0(x)e^{±iωt}.
Note that for both phonons and photons, the perturbation to the periodic potential is of the form W0(x)e^{±iωt}, and is explicitly time-dependent. The +/− signs in the exponent of the time-dependent part of the perturbation e^{±iωt} lead to the emission and absorption processes for both photons and phonons.

Defects [Chapter 23]: Let us say that the crystal is sitting in the dark, and at a very low temperature. Even then, if there are defects in the crystal, they will introduce perturbations to the periodic potential, which remain active also at finite temperatures. Let us say each defect has a potential W0(x), and there are identical defects at various locations x1, x2, x3, .... Then, the total potential seen by the electron is Vper(x) + Wtot(x), where Wtot(x) = W0(x − x1) + W0(x − x2) + W0(x − x3) + .... Note that the potential of these defects is time-independent11, unlike those of phonons and photons. Such defect potentials may be considered to be time-dependent, but with ω = 0.

Fermi's golden rule for transitions [Chapter 20]: The electrons in the semiconductor crystal are subject to the above variety of perturbation potentials of the form W(x,t). The time-dependent Schrödinger equation, and the resulting scattering rate of states due to the pertur-

10 Imagine we had a very fast camera, and took a snapshot of the crystal. We would then discover that the atoms are actually never exactly periodic! So does the entire buildup of the Bloch theory of the electron bandstructure of semiconductors collapse? Luckily, it does not: the phonon energies are typically much smaller than the energy bandgaps, and for most cases it is safe to treat the phonons as small perturbations to the perfectly periodic crystal potential. This is the reason why, even though at any given time instant the atomic locations are not exactly periodic, we can still observe a strong X-ray diffraction peak, as discussed in Chapter 10 and in Exercise 10.6, where the concepts of the structure factor and the Debye–Waller factor were introduced.

11 The defects that fall in this category of scatterers include ionized donors and acceptors, point defects such as charged or uncharged vacancies and interstitials, linear defects such as dislocations, and planar defects such as grain boundaries. The scattering potentials due to such defects do not change the energy of the electron states in a scattering event, but they change the momentum.


bation derived in Chapter 20 is given by

$$i\hbar\,\frac{\partial \psi(x,t)}{\partial t} = [\hat{H}_0 + W(x,t)]\,\psi(x,t).$$


Fig. 16.7 Electronic transitions immediately following the absorption of a photon in a direct gap semiconductor, and an indirect bandgap semiconductor. The direct gap semiconductor results in a radiative transition, whereas the indirect bandgap semiconductor results in a non-radiative transition. The competition between radiative times τr and non-radiative times τnr in a semiconductor defines it’s internal quantum effi1/τr ciency IQE = (1/τ )+( . If τr 0 there is no light, there is no more generation, implying Gn = 0. Excess electrons will be lost by recombination at the rate Rn = (n − n0 )/τn , where τn is the recombination rate. The continuity equation then becomes ∂ ( n − n0 ) d2 ( n − n0 ) d ( n − n0 ) ( n − n0 ) = Dn + µn F − 2 ∂t dx τn dx

$$\implies n(x,t) - n_0 = \frac{N}{\sqrt{4\pi D_n t}}\,\exp\left[-\frac{(x - \mu_n F t)^2}{4 D_n t} - \frac{t}{\tau_n}\right], \qquad (16.27)$$

where the solution of the partial differential equation takes the form of a Gaussian distribution, whose mean position drifts with the field at the drift velocity v = µnF, whose spread diffuses over a length √(Dn t) in time t, and whose particle density decays as e^{−t/τn}. The coefficient indicates that the 1D density decreases as N/√(4πDn t); in d dimensions, the coefficient is N/(4πDn t)^{d/2}. This drift and diffusion phenomenon is illustrated in Fig. 16.11. Because both diffusion and drift processes are linked to the exact same microscopic scattering mechanisms, we can expect the drift current Jdrift = qnµnE and diffusion current Jdiff = qDn dn/dx to be

Fig. 16.11 Diffusion, drift, and recombination of excess electrons after they are generated by a pulse of light in a semiconductor. This behavior was verified by the Haynes–Shockley experiment, a classic in semiconductor device physics. The measurement yields the diffusion constant Dn and the mobility µn simultaneously, and helps verify the Einstein relation Dn/µn = kbT/q derived in Equation 16.28.
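The drifting, spreading, decaying Gaussian of Equation 16.27 can be checked against a direct finite-difference solution of the continuity equation. A minimal sketch in Python, in arbitrary units with assumed illustrative values of Dn, v = µnF, and τn (the sign convention is chosen so the pulse drifts toward +x with speed v):

```python
import numpy as np

# Arbitrary consistent units, assumed for illustration only.
D, v, tau, N = 1.0, 2.0, 5.0, 1.0   # D_n, drift speed mu_n*F, lifetime tau_n, dose

def analytic(x, t):
    """Eq. (16.27): drifting, spreading, decaying Gaussian."""
    return N / np.sqrt(4 * np.pi * D * t) * np.exp(-(x - v * t)**2 / (4 * D * t) - t / tau)

x = np.arange(-10.0, 30.0, 0.05)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / D                 # small enough for explicit-scheme stability
t0, t1 = 0.5, 1.5
n, t = analytic(x, t0), t0           # start from the closed form at t0

while t < t1 - 1e-12:
    n_xx = (np.roll(n, -1) - 2 * n + np.roll(n, 1)) / dx**2   # diffusion term
    n_x = (np.roll(n, -1) - np.roll(n, 1)) / (2 * dx)         # drift term
    n = n + dt * (D * n_xx - v * n_x - n / tau)
    n[0] = n[-1] = 0.0               # pulse never reaches the boundaries
    t += dt

err = np.max(np.abs(n - analytic(x, t)))   # finite differences vs. closed form
```

The pulse peak ends up near x = v·t and the numerical profile tracks the closed-form Gaussian, mirroring what the Haynes–Shockley experiment measures.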


linked as well. Consider a semiconductor that has an intrinsic concentration gradient, due to, say, a doping density that varies in space. In that case, as carriers diffuse, they set up an internal electric field, and at equilibrium the net current must be zero. The electron density depends on the local potential as n(x) = Nc(x)e^{−(Ec(x)−EF)/kbT} = Nc · e^{+qV(x)/kbT}, where we write Ec(x) − EF = −qV(x). Though Nc(x) can vary in space, its variation is much slower than that of the exponential term, which is why we can assume it to be nearly constant. Thus we can write

$$J = J_{drift} + J_{diff} = qn\mu_n\left(-\frac{dV}{dx}\right) + qD_n\frac{dn}{dx} = 0 \implies \frac{D_n}{\mu_n} = \frac{dV}{dn/n},$$

$$n = N_c e^{\frac{qV}{k_bT}} \implies \frac{dV}{dn/n} = \frac{k_bT}{q} \implies \boxed{\frac{D_n}{\mu_n} = \frac{k_bT}{q}}. \qquad (16.28)$$

The boxed part is the Einstein relation, which highlights the deep connection between the diffusion and drift processes. Returning to drift transport, by relating the drift charge current to a Drude form Jdrift = σE = qnµE, where n = (gsgv/L^d)Σk fk, we identify the electron drift mobility as

$$\mu = \frac{J_{drift}}{qnE} = \frac{q^2 \frac{g_s g_v}{L^d}\left[\sum_k v_k^2 \tau_m(k)\left(-\frac{\partial f_k^0}{\partial E(k)}\right)\right]\cdot E}{q\left[\frac{g_s g_v}{L^d}\sum_k f_k\right]\cdot E} = q \cdot \frac{\sum_k v_k^2 \tau_m(k)\left(-\frac{\partial f_k^0}{\partial E(k)}\right)}{\sum_k f_k^0}. \qquad (16.29)$$

The numerator has the derivative of the Fermi–Dirac distribution with energy. This means that if we have a degenerate Fermi gas, when say EF > Ec, then for most of the electrons in the window Ec ≤ E(k) ≤ EF the term −∂fk^0/∂E(k) ≈ 0, and they cannot carry net currents. Their contribution to the ensemble mobility is low: this is Pauli blocking

14 We will build upon this brief introduction of transport phenomena in Chapters 21, 22, and 23. The basic introduction of the drift and diffusion processes in this chapter will enable us to treat various phenomena that occur in devices such as Schottky and p-n junction diodes in Chapter 18, without worrying about all the microscopic details.

in action. Because −∂fk^0/∂E(k) peaks at the Fermi level, the contribution to the mobility will be dominated by states near the Fermi level E ≈ EF; in other words, the conductivity is dominated by states near the Fermi level. If we consider the low-temperature approximation −∂fk^0/∂E(k) ≈ δ(E − EF) in the degenerate condition for parabolic bands, we can obtain µ ≈ qτm(kF)/m_c*, where kF is the Fermi wavevector. This expression of the mobility is in the Drude form. For the more general situation, the summation over the k states must be performed, which we discuss in the next section14.
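The Drude form gives a quick order-of-magnitude feel for mobilities; a one-line sketch, where the momentum relaxation time τm = 0.1 ps and the mass m_c* = 0.2me are assumed illustrative values, not numbers from the text:

```python
# Order-of-magnitude Drude mobility mu ~ q * tau_m(k_F) / m_c*.
# tau_m = 0.1 ps and m_c* = 0.2 m_e are assumed illustrative values.
Q  = 1.602176634e-19    # C
ME = 9.1093837015e-31   # kg

def drude_mobility(tau_s, m_eff):
    """Mobility in cm^2/(V s) for momentum relaxation time tau_s (seconds)
    and effective mass in units of the free-electron mass."""
    return Q * tau_s / (m_eff * ME) * 1e4   # m^2/(V s) -> cm^2/(V s)

mu = drude_mobility(1e-13, 0.2)   # ~880 cm^2/(V s)
```

Sub-picosecond scattering times and fractional effective masses land mobilities in the hundreds-to-thousands of cm²/(V·s), the range typical of doped semiconductors at room temperature.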

16.8 Explicit calculations of scattering rates and mobility

We now succinctly outline the explicit calculation of the drift mobility due to scattering between effective mass wavepacket states centered at |k⟩ and |k'⟩ within the same band, with corresponding


time-dependent wavefunctions φk(r,t) ≈ Ck(r,t)uk(r) and φk'(r,t) ≈ Ck'(r,t)uk'(r). Consider a scattering potential of the form W(r,t) = W0(r)e^{±iωt}. The time-dependent effective mass equation for the electron wavepacket centered at k is

$$i\hbar\frac{\partial C_k(\mathbf{r},t)}{\partial t} = [E_c(-i\nabla) + W(\mathbf{r},t)]\,C_k(\mathbf{r},t) = \left[-\frac{\hbar^2}{2m_c^*}\nabla^2 + E_c(\mathbf{r}) + W(\mathbf{r},t)\right]C_k(\mathbf{r},t). \qquad (16.30)$$


The lattice-periodic part uk(r) cancels in this equation, just as in the time-independent version. In the absence of the perturbation, the envelope functions of the states are Ck(r) and Ck'(r). We can write the momentum scattering rate using Fermi's golden rule for the envelope function as

$$\frac{1}{\tau(k \to k')} = \frac{2\pi}{\hbar}\,|\langle C_{k'}(\mathbf{r})|W_0(\mathbf{r})|C_k(\mathbf{r})\rangle|^2\,\delta[E(k') - (E(k) \pm \hbar\omega)],$$

$$\frac{1}{\tau_m(k)} = \sum_{k'} \frac{1}{\tau(k \to k')}\,(1 - \cos\theta),$$

where cos θ = k'·k/(|k||k'|). It is possible to directly use the envelope function in the golden rule for transitions within the same band, for small changes in k, which ensures that the periodic parts uk(r) of the Bloch functions are the same. For such transitions, the time-dependent effective mass equation, Equation 16.30, has an identical mathematical form as the exact time-dependent equation for the Bloch functions in Equation 16.15. Whenever this condition fails, one must resort back to using the functions Ck(r)uk(r) and Ck'(r)uk'(r), where the lattice-periodic parts are explicitly included15.

For transitions within the same band and for small changes in k, consider the case of scattering by a single positively charged ionized impurity (say an ionized donor atom) in a semiconductor crystal. Fixing the origin of coordinates at the location of the impurity, the screened scattering potential16 is $W(\mathbf{r},t) = W_0(\mathbf{r})e^{\pm i\omega t} = W_0(r) = -\frac{q^2}{4\pi\epsilon_s r}e^{-r/L_D}$, where εs is the dielectric constant of the semiconductor, $L_D = \sqrt{\frac{\epsilon_s k_b T}{q^2 n}}$ is the Debye screening length with n the free-electron concentration, and ω = 0, indicating the scattering potential is time-independent. The envelope functions are Ck(r) = (1/√V)e^{ik·r} and Ck'(r) = (1/√V)e^{ik'·r}, where V is the macroscopic volume of the semiconductor crystal. The scattering matrix element is

$$\langle C_{k'}(\mathbf{r})|W_0(r)|C_k(\mathbf{r})\rangle = \int d^3r\,\left(\frac{1}{\sqrt{V}}e^{-i\mathbf{k}'\cdot\mathbf{r}}\right)\cdot\left(-\frac{q^2}{4\pi\epsilon_s r}e^{-\frac{r}{L_D}}\right)\cdot\left(\frac{1}{\sqrt{V}}e^{i\mathbf{k}\cdot\mathbf{r}}\right) = -\frac{q^2}{\epsilon_s V}\cdot\frac{1}{\frac{1}{L_D^2} + |\mathbf{k} - \mathbf{k}'|^2},$$

15 Such situations arise when there are transitions between bands, say from the conduction to the valence band in interband transitions involving tunneling or optical processes, or in avalanche multiplication at high electric fields. The periodic part of the Bloch functions must also be included in intervalley scattering processes when a large part of the Brillouin zone is sampled. This may happen in transitions from the Γ-point with |s⟩ orbital lattice-periodic states to a |p⟩ state in the same band, but at the other extrema, say near the Brillouin zone edge.

16 The form of the screened Coulomb potential is identical to the Yukawa potential that keeps protons from flying apart in a nucleus. Hideki Yukawa proposed this potential in 1935. The particles that mediate the nuclear interaction, called mesons, were discovered experimentally, earning him the Nobel Prize in 1949.
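The statement that the matrix element is the Fourier transform of the screened Coulomb (Yukawa) potential can be verified numerically; a sketch with assumed dimensionless values of L (for LD) and q = |k − k'|:

```python
import numpy as np

# Check: the 3D Fourier transform of exp(-r/L)/(4*pi*r) equals 1/(q^2 + 1/L^2),
# the denominator of the matrix element above. L and q are assumed
# dimensionless illustrative values.
L, q = 2.0, 1.5
r = np.linspace(1e-6, 200.0, 2_000_001)
# The angular part of the 3D FT is analytic: FT(q) = (4*pi/q) * int r*f(r)*sin(q*r) dr,
# which for f(r) = exp(-r/L)/(4*pi*r) reduces to (1/q) * int exp(-r/L)*sin(q*r) dr.
y = (1.0 / q) * np.exp(-r / L) * np.sin(q * r)
ft_numeric = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r))   # trapezoid rule
ft_exact = 1.0 / (q**2 + 1.0 / L**2)                        # = 0.4 here
```

The numerical integral matches the closed form, which is why plane-wave envelope states turn the scattering problem into a Fourier-transform lookup.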



where the integral is evaluated in 3D spherical coordinates over the entire space by aligning the vector k − k' at an angle θ to the z-axis. Note that the scattering matrix element between plane-wave-like envelope states is equal to the Fourier transform of the scattering potential. Now the total momentum scattering rate of state k is

$$\frac{1}{\tau_m(k)} = \sum_{k'}\frac{2\pi}{\hbar}\cdot\left|\frac{q^2}{\epsilon_s V}\cdot\frac{1}{\frac{1}{L_D^2} + |\mathbf{k} - \mathbf{k}'|^2}\right|^2\cdot\delta[E(k') - (E(k) \pm \hbar\omega)]\cdot(1 - \cos\theta), \qquad (16.33)$$

which upon performing the sum over all 3D k' states, using the Dirac delta function, and considering the effect of a total of Nimp impurities, results in

$$\frac{1}{\tau_m(k)} = \frac{g_s g_v q^4 m_c^*}{8\pi\epsilon_s^2(\hbar k)^3}\cdot\underbrace{\frac{N_{imp}}{V}}_{n_{imp}}\cdot\left[\ln(1 + 4k^2L_D^2) - \frac{4k^2L_D^2}{1 + 4k^2L_D^2}\right], \qquad (16.34)$$

Fig. 16.12 Momentum scattering rate due to ionized impurities as a function of electron energy, and electron mobility as a function of temperature due to ionized impurity scattering and acoustic phonon scattering in three dimensions.

17 The mobility due to ionized impurity scattering can be read off from the dimensionless form of Equation 16.36. For example, for a semiconductor of conduction band effective mass m_c* = 0.2me, εr = 10, and impurity density equal to the free-electron concentration (nimp = n = 10^17/cm³, meaning the semiconductor is uncompensated), β = 83.5/√10 = 26.4, and F(β) = 5.55. The mobility at room temperature is then µimp ≈ (383082/5.55) · 1 · 1 · (1/10) ≈ 6902 cm²/V·s. This is the mobility if only ionized impurity scattering were active, meaning phonon and other scattering mechanisms are not considered. Similar expressions for other scattering mechanisms will be discussed in Chapter 23.

where k = |k| is the length of the wavevector, and nimp = Nimp/V is the volume density of the uncorrelated scatterers. The dimensions may be checked to be 1/sec. The term in the square brackets is a slowly varying function. The momentum scattering rate can now be used in Equation 16.29 to get the electron mobility explicitly. For a parabolic bandstructure E(k) = ħ²k²/2m_c*, assuming a reference Ec = 0, the momentum scattering rate as a function of the electron kinetic energy is

$$\frac{1}{\tau_m(E)} = \frac{g_s g_v q^4 m_c^*}{8\pi\epsilon_s^2(2m_c^*E)^{3/2}}\cdot n_{imp}\cdot\left[\ln\left(1 + \frac{8m_c^*E}{\hbar^2}L_D^2\right) - \frac{\frac{8m_c^*E}{\hbar^2}L_D^2}{1 + \frac{8m_c^*E}{\hbar^2}L_D^2}\right], \qquad (16.35)$$

which indicates that the scattering rate decreases as the kinetic energy of the electrons increases, and increases when the impurity density increases. These trends are indicated in Fig. 16.12. For a non-degenerately doped semiconductor, using the Maxwell–Boltzmann approximation of the equilibrium distribution function $f_k^0 \approx e^{-E(k)/k_bT}$, the electron mobility due to ionized impurity scattering is obtained to be

$$\mu_{imp} \approx \frac{2^{7/2}(4\pi\epsilon_s)^2(k_bT)^{3/2}}{\pi^{3/2}q^3\sqrt{m_c^*}\,n_{imp}\,F(\beta)} \sim \frac{T^{3/2}}{n_{imp}} \approx \frac{383082}{F(\beta)}\cdot\frac{\left(\frac{\epsilon_r}{10}\right)^2\left(\frac{T}{300\ \text{K}}\right)^{3/2}}{\sqrt{\frac{m_c^*}{0.2m_e}}\cdot\left(\frac{n_{imp}}{10^{16}\ \text{cm}^{-3}}\right)}\ \frac{\text{cm}^2}{\text{V}\cdot\text{s}}, \qquad (16.36)$$

where $F(\beta) = \ln[1 + \beta^2] - \frac{\beta^2}{1+\beta^2}$ and $\beta = 2\sqrt{\frac{2m_c^*(3k_bT)}{\hbar^2}}\,L_D \approx 48.2\,\sqrt{\frac{\epsilon_r}{10}}\cdot\left(\frac{T}{300\ \text{K}}\right)\sqrt{\frac{m_c^*}{0.2m_e}}\sqrt{\frac{10^{16}\ \text{cm}^{-3}}{n}}$ are dimensionless parameters17. As the impurity density increases, the mobility decreases due to more frequent scattering. The mobility increases with temperature because an
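Equation 16.35 can be evaluated directly with physical constants; a sketch assuming the footnote's parameters (m_c* = 0.2me, εr = 10, nimp = n = 10^17/cm³, T = 300 K) and gsgv = 2:

```python
import math

# Eq. (16.35), the screened ionized-impurity momentum scattering rate.
HBAR, ME = 1.054571817e-34, 9.1093837015e-31
Q, EPS0, KB = 1.602176634e-19, 8.8541878128e-12, 1.380649e-23

T, m = 300.0, 0.2 * ME
eps_s = 10.0 * EPS0
n_imp = 1e23                                          # /m^3  (= 1e17 /cm^3)
L_D = math.sqrt(eps_s * KB * T / (Q**2 * n_imp))      # Debye length, ~12 nm

def rate(E, gsgv=2.0):
    """Momentum scattering rate 1/tau_m(E) in 1/s; E is kinetic energy in joules."""
    x = 8.0 * m * E * L_D**2 / HBAR**2                # = 4 k^2 L_D^2
    bracket = math.log(1.0 + x) - x / (1.0 + x)
    return gsgv * Q**4 * m * n_imp * bracket / (8.0 * math.pi * eps_s**2 * (2.0 * m * E)**1.5)

r_kT = rate(KB * T)        # ~1e13 /s at the thermal energy
r_2kT = rate(2 * KB * T)   # smaller: faster electrons scatter less
```

The rate falls with kinetic energy in the thermal range, reproducing the trend of Fig. 16.12 that energetic electrons are deflected less by the screened impurity.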



energetic electron is less perturbed by the scattering potential from its original path. The mobility limited by ionized impurity scattering is high for semiconductors with small effective masses and large dielectric constants. For non-degenerate electron concentrations, the ionized impurity scattering limited mobility increases with temperature as T^{+3/2}, as indicated in Fig. 16.12. When the scattering rate and mobility limited by acoustic phonon scattering are calculated in a similar manner, the dependence is found to be µac ∼ T^{−3/2}. Since all scattering mechanisms act in parallel, the rates add, and the net mobility is set by the dominant scatterers. At low temperatures, phonons are frozen out, and defect and impurity scattering dominate. At high temperatures, in sufficiently defect-free samples, phonon scattering dominates. We will evaluate the scattering rates and mobilities of various processes in more detail in Chapter 23.
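Because the rates add in parallel, inverse mobilities add (Matthiessen's rule), and the competition between the T^{+3/2} impurity-limited and T^{−3/2} phonon-limited branches produces a mobility peak; the 300 K anchor values below are assumed for illustration only:

```python
import numpy as np

# Matthiessen's rule: scattering rates add, so inverse mobilities add.
# mu_imp ~ T^{+3/2} (ionized impurities), mu_ac ~ T^{-3/2} (acoustic phonons).
# The 300 K anchor values are assumed for illustration only.
T = np.linspace(50.0, 500.0, 451)
mu_imp = 5000.0 * (T / 300.0)**1.5     # impurity-limited branch, cm^2/(V s)
mu_ac = 2000.0 * (T / 300.0)**-1.5     # phonon-limited branch, cm^2/(V s)
mu_tot = 1.0 / (1.0 / mu_imp + 1.0 / mu_ac)
T_peak = T[np.argmax(mu_tot)]          # net mobility peaks where the two cross
```

The net mobility always lies below the weaker of the two limits and peaks at the temperature where the two branches cross, the behavior sketched in Fig. 16.12.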

16.9 Semiconductor electron energies for photonics

In the remainder of this chapter, we selectively sample the contents of Module IV, Chapters 26-29, on quantum photonics with semiconductors. A photon interacts with a semiconductor for the most part in an identical fashion as it does with an atom. Because the semiconductor is made of a large number of atoms, the primary difference is a larger number of allowed transitions compared to the atom. There are a few new physical phenomena that emerge in a semiconductor crystal and cannot happen for an individual atom, but knowing the photon-atom interaction physics goes a long way in explaining the photonic properties of semiconductors. Fig. 16.13 and its caption discuss the interaction of photons with atoms and with semiconductors in a comparative fashion.

Let the semiconductor bandgap be Ec − Ev = Eg, where Ec is the conduction band minimum and Ev is the valence band maximum. Let the conduction band effective mass be m_c* and the valence band effective mass m_v*, so that the conduction band states are given by Ec(k) = Ec + ħ²k²/2m_c* and the valence band states by Ev(k) = Ev − ħ²k²/2m_v*. This is indicated in Fig. 16.14. Now if there is an optical transition between the conduction band state E2 and a valence band state E1, the photon energy must be hν = E2 − E1 to ensure energy conservation. Because the photon in circularly polarized light carries an angular momentum ±ħ, the conduction band and the valence band should have a net angular momentum difference of ħ, which they do – because the conduction band states derive from |s⟩ orbitals, and the valence band states from |p⟩ orbitals. Finally, to ensure momentum conservation, we must have ħkc = ħkv + ħkp, where ħkp is the photon momentum. The electron states have |kc| ≈ |kv| ≈ π/a0, where a0 is a lattice constant, whereas the photon momentum ħ|kp| = h/λ is much smaller.


Fig. 16.13 Since an atom allows sharp and discrete energy levels for electrons (say E1 and E2 as shown), if photons of a large spectrum of energies h¯ ω = hν are incident, only those photons that sharply match the energy conservation requirement E2 − E1 = hν are absorbed by the atom. An electron must occupy state E1 and state E2 must be empty for the absorption to occur. After absorption, the atom is in an excited state – it takes typically a nanosecond for the electron to spontaneously transition back from E2 to E1 , emitting the photon back. This process of spontaneous emission can be slowed down by placing the atom in an optical cavity of an appropriate size. The emitted spectrum is also sharp. A single atom can only emit one photon at a time. Compared to the atom, as we have discussed in earlier chapters, a direct-gap semiconductor crystal has many energies in bands, which derive from the atomic states. In an oversimplified picture, the eigenvalues of the bands at each k may be pictured as separate atomic states. The absorption is vertical in k. Since the density of states increases away from the band edges, more photons at each energy can be absorbed, leading to an absorption edge at the bandgap Eg , and increasing absorption for higher photon energies. The absorption spectrum depends on the semiconductor material as well as dimensionality: it is different for 3D, 2D, 1D, and 0D. After absorption, the semiconductor is in an excited state. An electron in a high energy state in the conduction band can directly recombine with the hole state in the valence band of the same k by emitting a photon, or it can emit phonons and relax down in the conduction band to lower energy states. Since phonon emission takes ∼ps whereas photon emission takes ∼ns, most electrons relax to near the bottom of the conduction band, and most holes relax to the top of the valence band. They then recombine to emit photons. 
The emission spectrum is therefore peaked near the bandgap, quite different from the absorption spectrum. The linewidth of the emission spectrum is also larger than the atomic states because of the large number of states available close in energy. A bulk semiconductor can emit several photons of the same energy at the same time because there are a large number of states that simultaneously match the energy and momentum criteria.

Fig. 16.14 Electron energies in bands for which optical transitions are allowed.

While this approximation is a very good one for IR, visible, and UV photons, it fails if the photon energy is too high. For example, an X-ray photon can possess a substantial momentum, comparable to or even larger than ħ/a0 where a0 is a lattice constant, and since such a photon also has a very large energy, it can excite core-level electrons into vacuum in addition to the electrons in bands. This is the principle of X-ray photoelectron spectroscopy, or XPS, measurements. Considering photons of smaller energies, we make the assumption kc = kv = k, i.e., the conduction and valence band states that talk to, i.e., absorb or emit, photons must have the same k. Pictorially, these are what we call "vertical" transitions in the E(k)−k diagram. In Fig. 16.14, the energy state E2 in the conduction band and one in the valence band E1


of the same k are

$$E_2 = E_c + \frac{\hbar^2 k^2}{2m_c^*}, \quad \text{and} \quad E_1 = E_v - \frac{\hbar^2 k^2}{2m_v^*},$$

which leads to a photon energy

$$h\nu = E_2 - E_1 = (E_c - E_v) + \frac{\hbar^2}{2}\left(\frac{1}{m_c^*} + \frac{1}{m_v^*}\right)k^2 = E_g + \frac{\hbar^2 k^2}{2m_r^*},$$

where mr* = m_c*m_v*/(m_c* + m_v*) is the reduced effective mass18. Thus,

$$\underbrace{h\nu}_{\text{light}} = E_g + \frac{\hbar^2 k^2}{2m_r^*} \implies \boxed{\frac{\hbar^2 k^2}{2m_r^*} = h\nu - E_g}. \qquad (16.39)$$


18 The reduced effective mass mr* accounts for the fact that both the valence and conduction bands participate in an interband transition.


Because the LHS of the boxed equation is positive, the above calculation reinforces our intuitive observation that only photons of energy hν equal to or larger than the energy bandgap Eg will interact with the electron states19. Unlike the atom, in a semiconductor there are a large number of transitions possible for various k values that produce a photon of the same energy hν, which we quantify in the next section.

19 We are neglecting excitonic, multiparticle, and tunneling effects here and focusing on the strongest interband transitions.
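The vertical-transition argument, that the photon wavevector is negligible on the Brillouin-zone scale for visible light but not for X-rays, can be checked with two lines of arithmetic; a0 = 0.5 nm and the two wavelengths are assumed illustrative values:

```python
import math

# Vertical transitions: compare the photon wavevector with the zone-edge scale pi/a0.
# a0 = 0.5 nm and the wavelengths are assumed illustrative values.
a0 = 0.5e-9
k_bz = math.pi / a0                  # ~6.3e9 /m, zone-edge electron wavevector
k_visible = 2 * math.pi / 500e-9     # ~1.3e7 /m: k_c = k_v is an excellent approximation
k_xray = 2 * math.pi / 0.1e-9        # ~6.3e10 /m: comparable to or larger than pi/a0
```

Visible-light photons carry less than a percent of a zone-edge wavevector, which is why optical transitions are drawn vertical in the E(k)−k diagram, while X-ray photons do not satisfy this approximation.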

16.10 The optical joint density of states ρJ(ν)

How many such electron k states are available per unit volume of the semiconductor to emit or absorb photons in the energy interval20 [hν, h(ν + dν)]? This quantity is of central importance in the evaluation of electron-photon interactions, and is called the optical joint density of states (JDOS) of the semiconductor, denoted by ρJ(ν). The number of such states per unit volume is ρJ(ν)d(hν). The optical JDOS ρJ(ν) captures the properties of light (through hν) and of matter (through the semiconductor parameters: the bandgap Eg, and the band effective masses m_c* and m_v*). The quantity ρJ(ν) will also reflect whether the optically active semiconductor electron states are unconfined and free to move in 3D bulk, or confined in quantum wells (2D), quantum wires (1D), or quantum dots (0D)21.

To count the number of electron states in 3D, assume the electrons are confined in a cubic box of size Lx = Ly = Lz = L, as shown in Fig. 16.15. Because an integer number of electron half-wavelengths must fit in the box, we get the condition k = (kx, ky, kz) = (π/L)(nx, ny, nz), where (nx, ny, nz) is a triplet of integers each of whose values can be 0, 1, 2, .... This defines a 3D lattice in the k = (kx, ky, kz) space in the first octant, where each point denotes an allowed electron state, occupying a volume (π/L)³ = π³/V. Because the volume of the semiconductor cube is V = L³ and we are interested in the JDOS ρJ(ν) per unit volume, we also define the total JDOS as DJ(ν). The JDOS per unit volume is then related to the total JDOS by ρJ(ν) = DJ(ν)/V.

20 We use ħω = hν interchangeably.

21 Quantum confined semiconductor structures are routinely used in practical semiconductor light emitting diodes (LEDs), semiconductor optical amplifiers (SOAs), optical modulators, and lasers that are discussed in Chapters 28 and 29.

364 Controlling Electron Traffic in the k-Space

We discuss the 3D optical JDOS first. The optical JDOS in the k-space counts the same states as in the energy space, meaning ρk(k)dk = DJ(ν)d(hν). Using Equation 16.39 in the form (ħ²/2m_r*)(kx² + ky² + kz²) = hν − Eg, we obtain

ρk^3D(k)dk = 2 · (1/8) · 4πk²dk/[(π/Lx)(π/Ly)(π/Lz)] = DJ(ν)d(hν)

⟹ ρJ^3D(ν) = (1/2π²)(2m_r*/ħ²)^(3/2) √(hν − Eg),

Fig. 16.15 Electron states in a cube of side L that can interact with photons.

where ρJ^3D(ν) = DJ^3D(ν)/V was used. The JDOS ρJ^3D(ν) has units of 1/(energy·volume), as may be verified from Equation 27.44. For semiconductors it is expressed in 1/(eV·cm³). The JDOS for quantum confined electron states is found in the same manner as for the 3D case, except instead of a 3D k-space we have a 2D or lower-dimensional k-space. For example, for a quantum well laser, the electron and hole states are free to move in two dimensions (x, y) because of large dimensions Lx = Ly = L, and are confined in the third (z) direction by a heterostructure quantum well confining potential to a length Lz smaller than the electron de Broglie wavelength, leading to the requirement kz = nz·π/Lz.

If g0(ν) > 0, the intensity Iν(z) grows with distance according to Iν(z) = Iν(0)e^(g0(ν)z) and we have photon gain, as indicated in Fig. 16.20. If g0(ν) < 0, photons are absorbed, and one should refer to this quantity as the absorption coefficient. The prefactor α0(ν) in Equation 16.53 for the gain coefficient, which depends on the semiconductor JDOS and the photon properties, is always positive. So in order to obtain gain, the Fermi difference function should meet the criterion fc(E2) − fv(E1) =

1/(1 + e^((E2 − Fn)/kbT)) − 1/(1 + e^((E1 − Fp)/kbT)) > 0

⟹ Fn − Fp > E2 − E1  ⟹  Fn − Fp > hν > Eg.

Fig. 16.20 The photon (or light) intensity Iν(z) changes in a semiconductor as the wave propagates. If g0(ν) < 0, there is photon loss by absorption, which is central to the operation of photodetectors and solar cells. If g0(ν) > 0, there is optical gain. Optical gain is possible by stimulated emission, and is central to the operation of semiconductor optical amplifiers and lasers.


The boxed condition Fn − Fp > hν > Eg is the population inversion criterion for semiconductors, i.e., the condition to achieve optical gain. It is also referred to as the Bernard–Duraffourg condition after those who first identified it. Fig. 16.21 indicates the semiconductor gain coefficient at two excitation conditions. Since in a pn diode Fn − Fp = qV, it is necessary to apply a voltage larger than the effective bandgap of the optically active region to obtain gain. For photon energies matching Fn − Fp = hν, the net gain coefficient is g0(ν) = 0, meaning the semiconductor medium is transparent to those photons, as seen in Fig. 16.21.
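The sign change at the transparency point hν = Fn − Fp can be verified with a small numerical sketch of the Bernard–Duraffourg condition (the bandgap, quasi-Fermi levels, and the equal split of the excess photon energy between the two bands are illustrative assumptions of this sketch, not values from the text):

```python
import math

kT = 0.0259  # eV, room temperature

def fermi(E, F):
    """Fermi-Dirac occupation of a state at energy E for Fermi level F."""
    return 1.0 / (1.0 + math.exp((E - F) / kT))

def fermi_difference(h_nu, Fn, Fp, Eg):
    """f_c(E2) - f_v(E1) for a photon of energy h_nu (eV), with Ev = 0 and
    Ec = Eg, splitting the excess h_nu - Eg equally between the bands
    (a simplification: the true split is set by the band masses)."""
    E2 = Eg + 0.5 * (h_nu - Eg)  # electron state in the conduction band
    E1 = -0.5 * (h_nu - Eg)      # state in the valence band
    return fermi(E2, Fn) - fermi(E1, Fp)

Eg, Fn, Fp = 1.42, 1.50, -0.05   # strong injection: Fn - Fp = 1.55 eV > Eg
for h_nu in (1.45, 1.50, 1.55, 1.60):
    print(h_nu, fermi_difference(h_nu, Fn, Fp, Eg))
```

The printed difference is positive (gain) for Eg < hν < Fn − Fp, crosses zero at the transparency point hν = Fn − Fp, and is negative (loss) above it, regardless of how the excess energy is split between the bands.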

Fig. 16.21 The gain spectrum of a bulk semiconductor under non-equilibrium conditions. When the quasi-Fermi level splitting Fn − Fp is smaller than the bandgap, the gain spectrum is simply g0(ν) = −α0(ν), where α0(ν) is the absorption coefficient. When Fn − Fp > hν > Eg, population inversion is achieved and the gain spectrum is positive over a window of photon energies Eg < hν < Fn − Fp. For higher photon energies, there is absorption and loss.

23 The exponential decay of light intensity in a material medium is called the classical Beer–Lambert law.

Note that this is an induced transparency: had the semiconductor been in equilibrium, it would normally absorb photons of energy hν = Fn − Fp > Eg because it is larger than the bandgap. For photon energies exceeding this quasi-Fermi level split, the Fermi difference function is negative, leading to loss. Thus, the semiconductor gain spectrum has a bandwidth Eg/h ≤ ν ≤ (Fn − Fp)/h, and the shape of g0(ν) is dictated by the product of the optical JDOS and the Fermi difference function. The Fermi difference functions are shown for a low-level carrier injection into the pn junction when Fn − Fp < Eg, and a high-level injection when Fn − Fp > Eg, together with the corresponding gain spectra. If photon energies that fall in the gain window are reflected back and forth between mirrors, the gain builds up and exceeds the loss, resulting in stimulated emission. This is the working principle of the semiconductor laser, the topic of Chapter 29. We will discuss these photonic properties of bulk and low-dimensional semiconductor quantum structures in quantitative detail in the chapters of Module IV. It is clear that at equilibrium, or for small levels of carrier injection when Fn ≈ Fp = EF, the factor fc(E2) − fv(E1) ≈ −1, and we obtain the optical absorption coefficient of the semiconductor α0(ν) = A·(λ²/8πn0²)·hρJ(ν), whereby the intensity of photons changes as dIν(z)/dz = −α0(ν)·Iν(z) as they are absorbed. This leads to an intensity variation Iν(z) = Iν(0)e^(−α0(ν)z), implying that the intensity of light decreases exponentially with depth if the photons are absorbed. This decay happens over a characteristic length scale of 1/α0(ν), which is defined as the absorption length23. Returning to spontaneous emission, we obtain a spectrum

Rsp(ν) = AρJ(ν) fc(E2)[1 − fv(E1)]

= AρJ(ν) · [1/(1 + e^((E2 − Fn)/kbT))] · [1/(1 + e^(−(E1 − Fp)/kbT))]

≈ AρJ(ν) e^(−hν/kbT) e^((Fn − Fp)/kbT).
This condition is important for LEDs, because it indicates that by splitting the quasi-Fermi levels to Fn − Fp = qV with a voltage, we exponentially increase the spontaneous emission rate of photons. Of course, these photons must be extracted before they are re-absorbed: though the absorption rate is smaller, it is not negligible! Unlike the laser, the LED can operate at much smaller levels of current injection, well below the conditions of population inversion. The photons emitted by an LED are due to spontaneous emission, which is why they lack the coherence and directivity of those from a laser. Incoherent light from LEDs finds as much use as coherent light from lasers.

16.13 Chapter summary section

In this chapter, we learned:

• The basic principles of semiconductor physics,
• Fundamental concepts for semiconductor electronic devices,
• Fundamental concepts for semiconductor photonic devices, and
• Are now ready to put the physics to use in applications in electronic and photonic devices in Modules III & IV!

Further reading

Since this chapter serves as a condensed version of the entire subject, I have recommended several books in previous chapters. For Modules III and IV, I make a few more suggestions, which will be repeated in relevant chapters. For semiconductor devices, Physics of Semiconductor Devices by Simon Sze is the most comprehensive and remains as relevant today as when it was written. I strongly recommend Wave Mechanics Applied to Semiconductor Heterostructures by Gerald Bastard, where the emphasis is less on devices, and more on the quantum physics. For those who want to delve deeper into the quantum mechanical foundations of semiconductors, I recommend Quantum Theory of the Optical and Electronic Properties of Semiconductors by Haug and Koch. They have taken a rigorous approach to the subject, and cover several optical phenomena that are not discussed in this book.

Exercises

(16.1) Reading exercise
Since this chapter is a summary of the past and future topics of the book, no exercises are assigned specific to this chapter. To take advantage of its acting as a bridge from the physics to device applications, a few reading exercises are assigned.

(a) William Shockley, a pioneer of semiconductor physics and the co-inventor of the transistor, wrote a review article in 1952 called Transistor Electronics: Imperfections, Unipolar and Analog Transistors. This article captures the physics of semiconductors at a heuristic level that remains unmatched to date. Find and read this gem of an article. You will see that the thought processes that underpin the subject require minimal mathematical complications to understand. You will also encounter how the names source, drain, and gate were introduced!

(b) An excellent (and very thick!) book on semiconductor devices is Semiconductor Devices: Pioneering Papers, a collection by Simon Sze. This collection is a treasure trove of papers dating back to the 1800s. Those who want to learn of the true history behind today's semiconductor devices will appreciate reading the early papers, and who knows, it may inspire you to invent something new!

Part III

Quantum Electronics with Semiconductors

Game of Modes: Quantized R, L, and C

Traditional electronic circuit design is based on understanding and controlling the roles of resistors (R), inductors (L), and capacitors (C). Since the three RLC circuit elements are passive, they can store and carry electrical energy, but not indefinitely, because they are unable to amplify electrical signals to counteract and recover the losses in the resistor. Recovery of the signal requires electronic gain. Historically, electronic gain was found in vacuum devices, which have been mostly miniaturized and replaced by semiconductor-based active devices. In the following chapters, we discuss the quantum physics of diodes and transistors, the primary semiconductor devices that provide electronic gain. In this chapter, we focus on the quantum physics of the passive elements. As long as the wavelength λ = c/f of an electrical signal of frequency f (c is the speed of light) is much larger than the dimensions of a resistor, capacitor, or inductor, these passive elements behave as "point devices", or lumped elements of R, L, C. As electronic circuits are miniaturized and their operational frequencies increase, the quantum limits of resistance (or conductance), inductance, and capacitance (Fig. 17.1) assume increasingly important roles. In this chapter, we learn the effect of the number of modes M, group velocities v, and density of states (DOS) on:

• Quantized conductance Gq = (q²/h)M and resistance Rq = h/(Mq²),
• Quantum capacitance Cq = q²·DOS = (q²/h)·(2M/⟨v⟩), and
• Kinetic inductance Lk ∝ m*/(q²n) = (h/q²)·1/(2M⟨v⟩).

17.1 Classical R, L, C circuits
17.2 Quantized conductance
17.3 Quantum capacitance
17.4 Kinetic inductance
17.5 Quantum R, L, C circuits
17.6 Negative R, L, and C
17.7 Chapter summary section
Further reading
We connect each of these three concepts to the electronic bandstructure E(k) and the number of conducting modes M of the semiconductor. It is important to realize that in a typical circuit, the R, L, C are not just the three quantum parts discussed in this chapter, but include the classical values, either in series or in parallel, as shown in Fig. 17.1. The point of this chapter is to emphasize that for modern electronic devices, one needs to consider the quantum counterparts to get the physics right. Though in the past the quantum components were negligible, they will not remain so in the future.
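The three bulleted quantities are linked. A small sketch (using the mode forms listed above, Gq = (q²/h)M, Cq = (q²/h)(2M/⟨v⟩), and Lk = (h/q²)/(2M⟨v⟩); treat the exact prefactors as assumptions of this sketch) shows that the kinetic inductance and quantum capacitance combine into the electronic analog of LC = 1/c², with the carrier velocity ⟨v⟩ playing the role of the speed of light:

```python
import math

q = 1.602176634e-19  # C
h = 6.62607015e-34   # J*s

def quantum_rlc(M, v):
    """Quantized conductance, quantum capacitance, and kinetic inductance
    for M modes with average carrier velocity v, following the chapter's
    mode forms (prefactors are assumptions of this sketch)."""
    Gq = (q**2 / h) * M            # quantized conductance
    Cq = (q**2 / h) * 2 * M / v    # quantum capacitance
    Lk = (h / q**2) / (2 * M * v)  # kinetic inductance
    return Gq, Cq, Lk

Gq, Cq, Lk = quantum_rlc(M=1, v=1e5)  # one mode, v ~ 10^5 m/s
print(Gq, 1 / math.sqrt(Lk * Cq))     # 1/sqrt(Lk*Cq) recovers v
```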

Fig. 17.1 A classical transmission line showing the resistor R, the inductor L, and the capacitor C, and the additional elements in the quantum version. In this chapter, we explore the quantum versions of these three passive circuit parameters for semiconductors.


17.1 Classical R, L, C circuits

A cylindrical conductor of length L0 and diameter d, at a distance h from a ground plane, has a classical resistance R = (ρ/A)·L0, capacitance C = [2πε0/ln(h/d)]·L0, and inductance L = [µ0·ln(h/d)/2π]·L0, relations that may be found in a textbook on electromagnetism or electronic circuits. The linear increase of these three classical quantities with length defines the per-unit-length quantities R′ = R/L0 = ρ/A, C′ = C/L0 = 2πε0/ln(h/d),

Fig. 17.2 A differential section of a transmission line showing the classical resistance, inductance, capacitance, and conductance, each of which scale linearly with the length of the section. The second figure shows that when resistance is measured down to the nanometer scales, it stops scaling linearly below roughly a mean free path of the electron.

and L′ = L/L0 = µ0·ln(h/d)/2π. The product L′C′ = ε0µ0 = 1/c² is the inverse of the square of the speed of light c = 3·10⁸ m/s. Now imagine connecting a voltage source that applies a voltage V(t) = V0·e^(iωt) and injects a current I(t) = I0·e^(iωt) at a circular frequency ω = 2πf into the left side of a long conductor, which is modeled as the RLC network shown in Fig. 17.1. The voltage and current signals propagate to the right, and will have a spatial distribution V(x) and I(x) at a distance x from the voltage source. To find this variation, consider Fig. 17.2, which shows a differential element of the wire of length dx, a distance x from the source on the left. The voltage drop across the length ∂x is ∂V(x) = −(R′ + iωL′)·∂x·I(x). We allow for the possibility of some current leakage from the wire to the ground through the conductance per unit length G′ in parallel with the capacitance. The change in current across dx is ∂I(x) = −(G′ + iωC′)·∂x·V(x). Combining these, we obtain the celebrated telegrapher's equation:

∂V(x)/∂x = −(R′ + iωL′)·I(x)  and  ∂I(x)/∂x = −(G′ + iωC′)·V(x)

⟹ ∂²V(x)/∂x² = γ²·V(x)  and  ∂²I(x)/∂x² = γ²·I(x),

γ = √[(R′ + iωL′)(G′ + iωC′)]  ⟹  γ0² = −ω²L′C′ = −(ω/c)²,

V(x) = V+e^(−γx) + V−e^(+γx)  and  I(x) = (1/Z)(V+e^(−γx) − V−e^(+γx)),  Z = √[(R′ + iωL′)/(G′ + iωC′)],  (17.1)

Fig. 17.3 Oliver Heaviside (1850–1925), a self-educated electrical engineer, developed the telegrapher's equation as the current-voltage version of Maxwell's equations. He introduced the mathematical concepts of the unit-step (or Heaviside) function and the impulse (or Dirac-delta) function. Heaviside was quite enamored by the power of Maxwell's equations, which in its original formulation had too many variables and symbols. The neat four-equation modern formulation with divergences and curls of E and B fields is actually due to Heaviside. He coined the word "inductance".

which is how electrical engineers disguise Maxwell's equations, as first formulated by Heaviside (Fig. 17.3). The spatial variation of the real values of voltage and current at x at any time t is obtained from the complex solution by V(x, t) = Re[V(x)·e^(iωt)]. The electrical signal of the source propagates as an electromagnetic wave with wavevector γ, which for the lossless transmission line R′ = G′ = 0 in vacuum is γ0 = iω/c = i2π/λ, implying the signal propagates at the speed of light in the circuit, much faster than electrons can move, whose group velocities in crystals are limited to vg ≤ 10⁶ m/s, ∼300× slower than light. The ratio of the voltage and current is a characteristic impedance.


For the lossless case it is1 Z = √(L′/C′) = √(µ0/ε0)·ln(h/d)/2π. The lower part of Fig. 17.2 motivates the purpose of this chapter: the resistance is found not to scale linearly with the length if the conductor length becomes smaller than the electron mean free path λmfp for scattering. As indicated in Fig. 17.1, this experimental fact, and the analogous limits of capacitance and inductance, necessitates adding the quantum versions of R, L, and C to the transmission line model to get the physics right. We now discuss them, starting with the quantization of conductance.

1 We have treated the specific case of a cylindrical conductor here. The impedance and other characteristic quantities of the transmission line depend on the specific geometries. The failure of the lumped circuit model also occurs for high frequencies at which the electromagnetic wavelength is comparable to the size of the R, L, or C.
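The classical relations of Section 17.1 can be checked numerically (a sketch; the h/d ratio below is an arbitrary illustrative choice):

```python
import cmath
import math

eps0 = 8.8541878128e-12  # F/m
mu0 = 4e-7 * math.pi     # H/m

h_over_d = 5.0
C = 2 * math.pi * eps0 / math.log(h_over_d)   # capacitance per unit length
L = mu0 * math.log(h_over_d) / (2 * math.pi)  # inductance per unit length

# Lossless line: the signal travels at 1/sqrt(L'C') = c, independent of
# the geometry (the ln(h/d) factors cancel), and Z = sqrt(L'/C').
c = 1 / math.sqrt(L * C)
Z = math.sqrt(L / C)
print(f"c = {c:.4e} m/s, Z = {Z:.1f} ohm")

def gamma(R, G, omega):
    """Propagation constant of the lossy line (telegrapher's equation)."""
    return cmath.sqrt((R + 1j * omega * L) * (G + 1j * omega * C))

# With R' = G' = 0 the propagation constant is purely imaginary, i*omega/c.
g0 = gamma(0.0, 0.0, 2 * math.pi * 1e9)
```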

17.2 Quantized conductance

Since the classical resistance is R = ρ·L0/A, the ultrapure semiconductor heterostructures and microscale fabrication techniques of the 1980s made it experimentally possible to answer a vigorously debated question: what happens in the limit of very low impurities, when ρ → 0, and of very small device structures, when L0 → 0? If one removed all scattering, would the resistance approach zero, and the conductance approach infinity? There were several competing theoretical expectations, till the experimental observation shown in Fig. 17.4 cleared the air. Fig. 17.4 shows a modulation-doped semiconductor heterostructure formed between AlGaAs and GaAs with a type-I band alignment. Donor atoms are placed in the AlGaAs layer with a spacer layer, and the GaAs is undoped. Based on the physics discussed in Section 15.6, a 2D electron gas (2DEG) forms at the heterojunction. Because the mobile electrons are physically separated from the ionized dopant scatterers, there is a significant boost in the mobility, especially at low temperatures. For example, the electron mobility measured in this experiment was µ = 8.5 × 10⁵ cm²/V·s at T = 0.6 K, at a 2DEG density n2d = 3.6 × 10¹¹/cm². In Chapters 21 and 23, we will quantitatively evaluate the mobilities; here, we focus on the high-mobility, or ballistic, limit. When the semiconductor heterostructure was processed with split gates on the surface, and the current through the 2DEG was measured as a function of the voltage applied on the gate, the resistance was found to go through a series of well-defined steps. The conductance, as seen in the bottom panel of Fig. 17.4, was found to be quantized in units of G = M × (2q²)/h, where M is an integer. The value of the integer M changes with the gate voltage, whereas the magnitude of the electron charge q and Planck's constant h are fundamental constants.

This measurement has since been repeated in several semiconductors: in silicon, InGaAs, GaN, and others, and is a universal feature of high mobility semiconductors. It is the quintessential example of quantized conductance in the ballistic limit of transport. Fig. 17.5 shows the evolution of experimentally measured conductance over nearly two centuries, from Ohm's law to ballistic conductors. In the late 1960s, Sharvin had observed that the conductance became independent of the length when the length of the conductor

Fig. 17.4 The quantization of conductance in high mobility semiconductor heterostructure point contacts, adapted from Phys. Rev. Lett. 60, 848 (1988). Note that the conductance is quantized at B = 0, making this form of conductance quantization different from the quantum Hall effect, though there are some similar looking features. The differences will be discussed in Chapter 25.


Fig. 17.5 The classical "Ohm's law" 2D conductance G = σW/L was found to fail in explaining the lack of dependence on length when the distance between contacts is smaller than the mean free path of electrons (Sharvin, 1970). Then, when the electron wavelength is of the order of the width, quantized conductance is observed (1988, Fig. 17.4).

2 The quantization of conductance, though a revelation at the time, was observed 8 years after the observation of the integer quantum Hall effect, in which very precise conductance quantization, also in units of Nq²/h, is measured in 2DEGs. However, the Hall conductance is the transverse conductance σxy in the presence of a magnetic field, whereas what we are discussing here is the standard two-terminal conductance σxx, or resistance, with no magnetic field. The quantum Hall effect is discussed in Chapter 25.

was smaller than an electron mean free path, but it remained proportional to the width, or cross-sectional area, of the conductor. When the width of the conductor became of the order of the electron wavelength, only M = Int[W/(λ/2)] modes whose wavelengths λ fit in the point contact width can transmit; the others must reflect. For a Fermi wavevector kF = 2π/λF, the number of modes is M = Int[kF·W/π] in 2D and M = Int[kF²·A/π²] in 3D. This picture of conductance is similar to sound or light wave propagation in waveguides, as shown in Fig. 17.6. That the conductance is quantized in units of 2q²/h should not be that big of a surprise to you: in Chapter 5, Equation 5.40, we learned that this must be the case for a 1D conductor with free electrons in the absence of scattering2. The group velocity of a mode k of the energy band is vg = ħ⁻¹∇kE(k), and therefore the current

I = (q·gs·gv/L) Σk vg(k) f(k) = (q·gs·gv/ħL) ∫ [dk/(2π/L)]·(∂E(k)/∂k)·f(k)

⟹ I = (q·gs·gv/2πħ)·(kbT)·ln[(1 + e^((EFs − Ec)/kbT))/(1 + e^((EFd − Ec)/kbT))]  ⟹  at T = 0: I/V = gs·gv·(q²/h)  (17.2)

Fig. 17.6 Current flow driven by the Fermi level difference by filling states from the source contact and emptying states into the drain contact. The maximum conductance is limited by the number M = Int[W/(λ/2)] or Int[(W/(λx/2))·(H/(λy/2))] of electron half-wavelengths that fit in the cross-section of the conductor, each providing a conductance of q²/h.

is the product of the velocity of the modes and the occupation function, summed over all k-states. In the ideal 1D situation, the number of cross-sectional modes is by definition M = 1, the spin-degeneracy produces gs = 2 states of equal energy, and the valley degeneracy gv due to the crystal structure can produce further copies of equal-energy states at different k points in the Brillouin zone. Fig. 17.6 shows the electron (and hole) bandstructure of the conductor. The occupation function is controlled by the electrochemical potentials µ1 = EFs and µ2 = EFd of the source and the drain contacts, and the difference is µ1 − µ2 = EFs − EFd = qV, where V is the applied voltage. For 1D, the group velocity cancels with the density of states, leading to conductance quantization that is independent of the exact nature of the bandstructure E(k). As the split gate voltage is swept, lateral depletion changes the electronic width of the constriction, changing the number of 1D modes that fit the width. Each mode that fits contributes a conductance quantum, explaining the steps observed in Fig. 17.4. We can


then generalize the ballistic conductance for different dimensions:

Gq = gs·gv·(q²/h)·M,  M = 1 (1D),  M = Int[W/(λF/2)] (2D),  M = Int[(W/(λFx/2))·(H/(λFy/2))] (3D),  (17.3)

where λF is the Fermi wavelength of electrons in the highest filled states. The cross-section into which the half-wavelengths must fit is of one lower dimensionality than the conductor itself. This viewpoint is consistent with the ballistic current density we had obtained in Chapter 5 for a conductor with parabolic bands in d dimensions, where the lower-dimensional effective DOS Nc^(d−1) appears explicitly:

Jd = (q²/h)·Nc^(d−1)·(kbT/q)·[F_{(d−1)/2}((EFs − Ec)/kbT) − F_{(d−1)/2}((EFs − Ec − qV)/kbT)].  (17.4)
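Equations 17.2 and 17.3 can be checked numerically (a sketch; energies are measured from the band edge Ec, and the Fermi levels, temperature, and constriction widths are illustrative values):

```python
import numpy as np

q = 1.602176634e-19   # C
hbar = 1.0545718e-34  # J*s
h = 2 * np.pi * hbar  # J*s

def ballistic_current(V, EFs=0.1, kT=2e-4, gs=2, gv=1):
    """1D ballistic current of Equation 17.2; V, EFs, kT in eV, energies
    measured from Ec. np.logaddexp(0, x) evaluates ln(1 + e^x) without
    overflow for large x."""
    EFd = EFs - V
    pref = q * gs * gv / (2 * np.pi * hbar)
    return pref * (kT * q) * (np.logaddexp(0, EFs / kT) - np.logaddexp(0, EFd / kT))

def modes_2d(W, lambda_F):
    """Equation 17.3 mode count in 2D: half-wavelengths fitting in W."""
    return int(W / (lambda_F / 2))

def ballistic_conductance(M, gs=2, gv=1):
    """Gq = gs*gv*(q^2/h)*M of Equation 17.3, in siemens."""
    return gs * gv * (q**2 / h) * M

# Degenerate, low-temperature limit of Eq. 17.2: G -> gs*gv*q^2/h per mode,
# independent of the bandstructure details.
V = 1e-3
print(ballistic_current(V) / V, ballistic_conductance(1))

# Widening a constriction (lambda_F = 25 nm assumed) admits more modes,
# one conductance quantum at a time, as in the split-gate steps of Fig. 17.4.
for W in (20e-9, 60e-9, 260e-9):
    print(W, modes_2d(W, 25e-9))
```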

We discuss the experimental details of the step-like change of conductance to bring out the important interplay of scattering, electron mean free paths, quasi-Fermi levels, and non-equilibrium physics that will be central to the electron transport phenomena discussed in later chapters. Landauer (Fig. 17.7) had pioneered viewing electron transport not through the Drude picture of electrons as classical particles scattering from point defects, but as modes of electron waves that undergo reflection or transmission across the potential barriers caused by scattering sites. For a quantitative feel of length scales, the Fermi wavelength λF and mean free path λmfp for a 2DEG of carrier density n2d and mobility µ are:

kF = √(4πn2d/(gs·gv))  ⟹  λF = 2π/kF = √(π·gs·gv/n2d) = 25 nm/√(n2d/(10¹²/cm²)),

λmfp ∼ vF·τ, where µ = qτ/m*, so λmfp ∼ (ħkF/m*)·(µm*/q)  ⟹  λmfp ∼ (ħµ/q)·√(4πn2d/(gs·gv))  ⟹  λmfp ∼ [(µ/(10³ cm²/V·s))·√(n2d/(10¹²/cm²))·16.5] nm,  (17.5)

where the numerical formulae are for gs = 2 and gv = 1, true for the GaAs heterostructure. For the starting 2DEG density n2d = 3.6 × 10¹¹/cm² used in Fig. 17.4, the Fermi wavelength is λF ≈ 40 nm, much larger than atomic dimensions, which makes it possible for the gate voltage to control the small number of electron modes that pass through the point contact, leading to the conductance quantization. For µ = 8.5 × 10⁵ cm²/V·s, the mean free path is λmfp ∼ 8.4 µm. Fig. 17.8 indicates the range of Fermi wavelengths and mean free paths encountered in typical semiconductor structures. Note that the Fermi

Fig. 17.7 Rolf Landauer in 1950s at IBM realized the role of the wave nature of electrons on the electrical and thermal conductance in quantized structures, and contributed to quantifying the physics of information.
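Equation 17.5 is simple to evaluate (a sketch for gs = 2 and gv = 1, as in the GaAs heterostructure):

```python
import math

hbar = 1.0545718e-34  # J*s
q = 1.602176634e-19   # C

def fermi_wavelength(n2d_per_cm2, gs=2, gv=1):
    """lambda_F = sqrt(pi*gs*gv/n2d); n2d in 1/cm^2, result in meters."""
    n2d = n2d_per_cm2 * 1e4  # 1/cm^2 -> 1/m^2
    return math.sqrt(math.pi * gs * gv / n2d)

def mean_free_path(n2d_per_cm2, mu_cm2, gs=2, gv=1):
    """lambda_mfp ~ (hbar*mu/q)*sqrt(4*pi*n2d/(gs*gv)); mu in cm^2/V*s."""
    n2d = n2d_per_cm2 * 1e4
    mu = mu_cm2 * 1e-4       # cm^2/V*s -> m^2/V*s
    kF = math.sqrt(4 * math.pi * n2d / (gs * gv))
    return hbar * mu * kF / q

# n2d = 1e12/cm^2 and mu = 10,000 cm^2/V*s give lambda_F ~ 25 nm and
# lambda_mfp ~ 165 nm, the values quoted in the Fig. 17.8 caption.
print(fermi_wavelength(1e12), mean_free_path(1e12, 1e4))
```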


Fig. 17.8 Fermi wavelengths and mean free paths as a function of the 2D electron gas density (left) and electron mobility (right). For example, the mean free path of electrons at the Fermi level at a density of n2d = 10¹²/cm² for a mobility of 10,000 cm²/V·s is λmfp ∼ 165 nm, about 8× larger than the Fermi wavelength of λF ∼ 25 nm, both of which are obtained from Equation 17.5. The shaded region indicates that higher 2DEG densities enable regimes in which the mean free path is larger than the Fermi wavelength. Note that while the mean free path λmfp depends on the mobility and bandstructure, the Fermi wavelength λF does not: it is purely a function of the electron density.



wavelength is independent of electron transport properties and the details of the semiconductor bandstructure; it depends solely on the density of electrons. Fig. 17.9 quantitatively indicates how Landauer's picture of a ballistic conductor evolves into a "drift-diffusion", or classical, conductor in the presence of increasing scattering in the channel. Consider the conductor of length L with no scatterers. A voltage difference of V between the contacts creates two populations of carriers in the channel: those in equilibrium with the source, which share its quasi-Fermi level µ+ = µ1 and constitute carriers moving to the right, from the source to the drain. The second population of electrons moves to the left, and has a quasi-Fermi level µ− = µ2, the Fermi level of the drain. Since in the ballistic case (a) there is no scattering, the quasi-Fermi levels stay constant from the source to the drain: no right-moving carrier scatters to become a left-moving carrier, and no left-moving carrier scatters to the right. The electrochemical potential that characterizes the total carrier concentration at z is the average µ(z) = (µ+ + µ−)/2. The current of Equation 17.2 satisfies the current continuity and boundary conditions:

I = Gq·(µ+ − µ−)/q,  dI/dz = 0,  and  µ+(z = 0+) = µ1 and µ−(z = L−) = µ2,

Fig. 17.9 The evolution of energy diagrams showing the evolution of the spatial variation of quasi-Fermi levels from the ballistic (a) to a single scatterer (b), to a quasi-diffusive (c), to a completely diffusive (d) regime of transport. A single mode-based approach accounts for the conductance of all cases. The ballistic picture of wave-fitting leads to effective voltage drops at the contacts, and scatterers introduce additional voltage drops, as indicated by the resistor network circuit diagrams. For very long conductors, the difference of the quasi-Fermi levels is too small, and a single µ(z) as shown in (d) is used to capture the transport and electrostatics.

3 Coherent scattering and interference effects can lead to resonant tunneling and other exotic phenomena, as discussed in Chapter 24.


where the ballistic conductance Gq is given by Equation 17.3. In steady state, current continuity implies I =constant in z. For the ballistic case of Fig. 17.9 (a), µ(z), the average of µ+ and µ− , drops abruptly at the source and drain contacts, and stays constant in the channel, implying there is no voltage drop in the channel. Writing Rq = 1/Gq , the voltage drops by IRq /2 at each contact, as if there were two resistors Rq /2 at each contact, indicated by the circuit diagram. Fig. 17.9 (b) shows how the situation changes upon the introduction of a single scatterer. A fraction T of the right-going carriers manage to transmit across the scatterer, whereas 1 − T are reflected back towards the source. This is captured in the drop in µ+ moving to the right across the scatterer, and the corresponding increase in µ− crossing the scatterer to the left. The physical meaning is clear: the electrons that are reflected back to the left from the scatterer increase the leftgoing quasi-Fermi level. The drop in the total electrochemical potential across the scatterer indicates that it introduces an additional resistance Rq ( 1−T T ). Fig. 17.9 (c) shows a ”zoomed-out” picture for the case when a large number of such scatterers are introduced throughout the channel. Writing the mean-free path as λm f p = λ, the net resistance in the channel becomes Rq λL , related to the single scatterer of part (b) via λ 3 T = L+ λ , if coherent scattering and interference effects are neglected .


Thus for both the diffusive and ballistic regimes,

µ+ − µ− = [λ/(L + λ)]·(µ1 − µ2),  (µ+ − µ−)/q = IRq,  and  I = Gq·(µ+ − µ−)/q = [Gq·λ/(L + λ)]·(µ1 − µ2)/q.  (17.7)

In the diffusive limit, when there is a gradient in the electrochemical potential as seen in Figures 17.9 (c) and (d), the current is equivalently found from the equation

I = −(Gq·λ/q)·(dµ/dz) = −(σA/q)·(dµ/dz) ≈ [Gq·λ/(L + λ)]·(µ1 − µ2)/q = V/[Rq·(1 + L/λ)],  (17.8)
which is useful especially for very long channels, when µ+ − µ− = [λ/(L + λ)]·qV ≈ 0 and the I = Gq·(µ+ − µ−)/q form, though still correct, is not practical to use. Fig. 17.9 (d) shows a long-channel FET that we will encounter in Chapter 19, for which the separation between µ+ and µ− is negligible, yet µ(z) = (µ+ + µ−)/2 has a spatial variation distinct from the band-edge Ec(z). The current is found from the gradient of µ(z), not Ec(z). To summarize4, accounting for the quantized conductance makes the resistance of a conductor R(L) = Rq(1 + L/λ). When L → 0, the resistance approaches Rq, as was indicated in Fig. 17.2 for small conductors. Whether one should use the diffusive limit of Equation 17.8 or the ballistic (or quasi-ballistic) limit of Equation 17.7 depends on the problem. The concept of quasi-Fermi levels will prove to be of great importance in the following chapters. Here, we encountered the quasi-Fermi levels of right-going and left-going carriers. In Chapter 18 we will use quasi-Fermi levels to understand the operation of ballistic Schottky diodes. We will then introduce two quasi-Fermi levels, one for electrons in the conduction band and the other for holes in the valence band, to understand the operation of pn junctions and bipolar transistors in the diffusive transport regime. In Chapter 19 we will use quasi-Fermi levels for both ballistic and diffusive regimes to understand the operation of field-effect transistors. In Chapter 21 we will describe the behavior of the quasi-Fermi levels with the Boltzmann transport equation, and use them in Chapters 27, 28, and 29 to understand light-matter interaction and photonic device operation.

How does the conductance scale down to the last atoms? If there were a single atom, or its semiconductor counterpart, a heterostructure quantum dot, connected to a source reservoir at Fermi level EFs and a drain reservoir at EFd, the maximum conductance for an electron to transport through a single discrete level of the quantum dot is also G = 2q²/h. The reason for this is highlighted in Fig. 17.10. The coupling of the discrete energy eigenvalue to the electrodes causes an effective broadening of the energy level, which is a way of saying that it is not a state of definite energy, since an electron put in it can leak out to the drain. A single energy level can lead to on-site Coulomb

⁴ Equations 17.2 and 17.3, together with 17.7 and 17.8, convey the Landauer picture of conductance when the role of inelastic scattering (e.g. by phonons) is neglected. The role of elastic scatterers was first identified by Landauer, and that of the contacts by Yoseph Imry. Our earlier discussion in Chapter 5 of conductance quanta and current saturation in the ballistic regime of transport for free electrons carries over to transport in crystals directly, with the free-electron mass replaced by the band-edge effective masses, and the zero of energy by the band edges. Exercise 17.2 gives you an opportunity to practice and become comfortable in this way of thinking. In 1D, the effective mass does not appear explicitly, but it does in 2D and 3D transport. The 2D and 3D ballistic transport cases may be derived by considering them as parallel arrays of 1D conductors, and using the simple conductance quantum 2q²/h for each mode with the correct Fermi level differences.
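The ballistic-to-diffusive crossover R(L) = Rq(1 + L/λ) summarized above can be made concrete with a short numerical sketch (the single-mode Rq = h/2q² and the 100 nm mean free path are illustrative assumptions, not values from the text):

```python
# Ballistic-to-diffusive crossover: R(L) = Rq * (1 + L/lambda_mfp),
# where Rq = h/(2 q^2) is the resistance of one spin-degenerate mode.
h = 6.626e-34                 # Planck constant (J*s)
q = 1.602e-19                 # electron charge (C)
Rq = h / (2 * q**2)           # ~12.9 kOhm quantum of resistance
lam = 100e-9                  # assumed mean free path: 100 nm

def R(L, mfp=lam):
    """Resistance of a single-mode conductor of length L (meters)."""
    return Rq * (1 + L / mfp)

for L in [0, 10e-9, 100e-9, 1e-6]:
    print(f"L = {L*1e9:7.1f} nm -> R = {R(L)/1e3:8.1f} kOhm")
# As L -> 0 the resistance saturates at Rq instead of vanishing;
# for L >> lambda it grows linearly in L, recovering Ohm's law.
```

The L → 0 limit returns Rq, reproducing the behavior indicated in Fig. 17.2 for small conductors.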

Fig. 17.10 Conductance between two electrodes through a single atomic level, or a quantum dot. The maximum conductance is 2q2 /h. Due to the perturbation of the eigenvalue of the quantum dot by wavefunction overlap with those of the two electrodes, the sharp eigenvalue of the isolated dot is broadened, making it not a state of definite energy, and allowing transport through it.

382 Game of Modes: Quantized R, L, and C

repulsion and many-particle effects, similar to the donor and acceptor dopants discussed in Chapter 15. Though the 2D conductor is the most popular, and 1D versions are increasingly used for efficient transport of carriers in transistors, the quantum dot version is also increasing in importance in several applications, ranging from memories to LEDs and lasers.

17.3 Quantum capacitance

The capacitance C of a device is a measure of its capacity to store electrical energy U in the form of charge Q. The classical capacitance of a conductor depends solely on its geometry, because the electrical energy is stored in the small displacements of the electron clouds in the insulating, or dielectric, regions in the space outside the conductor(s). Since the DC electric field inside a perfect conductor is zero, no energy is stored inside the conductor for the classical capacitor. This classical picture is incomplete in nanoscale capacitors: a part of the electrical energy is also stored inside the conductor. This property of a conductor is called its quantum capacitance. The electrical energy divides between the quantum and classical geometric capacitances. To add a charge dQ, a voltage dV must drop across the electrodes connected to the device. The ratio C = dQ/dV is defined as its capacitance. The total charge Q is related to the total voltage drop V by ∫dQ = C∫dV. To put a total charge Q = CV on the device, an energy

$$ U = \int_{V'=0}^{V} Q\, dV' = \int_{V'=0}^{V} CV'\, dV' = \frac{1}{2}CV^2, \quad \text{or} \quad U = \int_{Q'=0}^{Q} \frac{Q'}{C}\, dQ' = \frac{Q^2}{2C} $$

must be provided to the capacitor. Experimentally a battery provides the external voltage V, and the charges Q respond to this voltage. Since U = ½CV², to store a larger amount of energy for the same voltage drop, a higher capacitance⁵ is desired. As standard textbook examples, the classical capacitance of a parallel plate capacitor of thickness d and area A is Ck = ε0κA/d, Csph = 4πε0κR for a sphere of radius R, and Ccyl = 2πε0κL/ln(4h/d) for a cylindrical conductor of length L and diameter d a distance h from a ground plane. To find the quantum capacitance of a conductor, we first note that electrons occupy allowed energy eigenvalues E(k) of the conductor up to the Fermi level, or an electrochemical potential µ, with a Fermi–Dirac distribution f_k in energy. The change of the total charge Q = q ∑_k f_k

⁵ High-κ dielectric materials enable higher capacitance, as may be directly seen for the parallel plate capacitor C = ε0κA/d of thickness d and area A. Viewing a capacitor as a device that stores energy, high-κ materials enable compact batteries. Viewing it as a device that stores charge and helps manipulate charge with better energy efficiency, high-κ dielectrics are desired for both semiconductor memories, and in field-effect transistors, the logic element.



with the electrochemical potential is the quantum capacitance:

$$ f_k = \frac{1}{e^{\frac{E(k)-\mu}{k_bT}} + 1}, \quad Q = q\sum_k f_k \implies \frac{\partial Q}{\partial(\mu/q)} = q^2 \sum_k \underbrace{\frac{\partial f_k}{\partial \mu}}_{\frac{1}{4k_bT}\cosh^{-2}\left[\frac{E(k)-\mu}{2k_bT}\right]} $$

$$ \implies C_q = \frac{q^2}{4k_bT} \int \frac{d^dk}{\left(\frac{2\pi}{L}\right)^d}\cdot\frac{1}{\cosh^2\left[\frac{E(k)-\mu}{2k_bT}\right]} \ \underset{T\to 0}{\approx}\ q^2 \cdot g(\mu)\cdot L^d. $$

This general formulation provides the quantum capacitance Cq of any conductor whose bandstructure E(k) is known. The resulting Cq per unit ”volume” is found to be q² × g(µ), i.e., the electron charge squared times the DOS at the Fermi level⁶. For example, for metallic carbon nanotubes, E = ħv_F|k|, and

$$ C_q = \frac{q^2 g_s g_v}{4k_bT}\int_{-\infty}^{+\infty} \frac{dk}{(2\pi/L)}\, \frac{1}{\cosh^2\left[\frac{\hbar v_F|k|-\mu}{2k_bT}\right]} = L \cdot \underbrace{\frac{2 g_s g_v q^2}{h v_F}}_{q^2\cdot \mathrm{DOS}} \cdot \underbrace{\frac{1}{1+e^{-\mu/k_bT}}}_{\mu \gg k_bT \implies \approx 1}. \tag{17.11} $$

⁶ See Exercise 17.3 for an equivalent formulation of quantum capacitance in terms of conducting modes, Cq = (q²/h)⟨2M/v⟩.
Similarly, for a 2DEG with a parabolic bandstructure E(k_x, k_y) = E_c + ħ²(k_x² + k_y²)/2m_c*, the quantum capacitance evaluates to

$$ C_q = \frac{q^2 g_s g_v}{4k_bT} \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \frac{dk_x\, dk_y}{(2\pi/L_x)(2\pi/L_y)}\, \frac{1}{\cosh^2\left[\frac{E_c + \hbar^2(k_x^2+k_y^2)/2m_c^* - \mu}{2k_bT}\right]} $$

$$ = \frac{q^2 g_s g_v L_xL_y}{4k_bT} \int_{\theta=0}^{2\pi}\!\int_{k=0}^{+\infty} \frac{k\,dk\,d\theta}{(2\pi)^2}\, \frac{1}{\cosh^2\left[\frac{\hbar^2k^2/2m_c^* - (\mu - E_c)}{2k_bT}\right]} = L_xL_y \cdot \underbrace{\frac{q^2 g_s g_v m_c^*}{2\pi\hbar^2}}_{q^2\cdot\mathrm{DOS}} \cdot \frac{1}{1+e^{-\frac{\mu - E_c}{k_bT}}}, $$
which for µ − Ec ≫ k_bT again yields the general form Cq/A = q²·DOS. As long as the Fermi level is a few k_bT inside a conducting band (either conduction or valence band)⁷, the semiconductor is degenerate, and the quantum capacitance is faithfully reproduced by Cq/A = q²·DOS. How is the capacitance C related to the atoms and materials and their geometry that make the device? Microscopically, where exactly is this energy stored? For the classical, or geometric, capacitance the electrical energy U = ½CV² is stored in the displaced negatively charged electron clouds with respect to the positively charged nuclei of the insulating dielectric, or material medium, filling the space between the conductors. The displacement is due to the force exerted on the charges by the external electric field, which perturbs them from their equilibrium positions.

⁷ Note that when µ − Ec ≪ −k_bT, the occupation factor in the result above gives Cq/A ≈ q²·DOS·e^{(µ−Ec)/k_bT}: the quantum capacitance becomes exponentially small when the Fermi level is inside the bandgap.

When V > VT, the Fermi level rises to EFs > E0, the sheet density ns ≥ Nc2d is high, and neglecting the −1 in the LHS of Equation 17.17 we obtain

$$ \left(\frac{1}{n_b} + \frac{1}{N_c^{2d}}\right) n_s \approx \frac{V-V_T}{V_{th}} \implies n_s \approx \frac{V-V_T}{\left(\frac{1}{n_b}+\frac{1}{N_c^{2d}}\right)V_{th}} \approx \frac{C_bC_q}{C_b+C_q}\cdot\frac{V-V_T}{q}. $$

Fig. 17.13 Illustration of the errors incurred by neglecting the quantum capacitance. The values used are εb = 20ε0, tb = 3 nm, m* = 0.2me, gs = gv = 2, and T = 300 K. This represents, for example, a metal–HfO2–silicon M-I-S capacitor. Note the error at high V − VT if Cq is neglected. This regime of operation represents the on-state of a transistor if the semiconductor is used as the conducting channel.
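The size of the error illustrated in Fig. 17.13 can be estimated with a simple series-capacitor sketch using the figure's stated values; the degenerate-limit expression Cq = q²·DOS for the 2DEG is assumed here:

```python
import math

q = 1.602e-19; hbar = 1.0546e-34; eps0 = 8.854e-12; me = 9.109e-31

# Barrier (HfO2-like) and channel parameters from Fig. 17.13:
kappa, tb = 20.0, 3e-9
gs = gv = 2; mstar = 0.2*me

Cb = eps0*kappa/tb                            # F/m^2, classical barrier capacitance
Cq = q**2 * gs*gv*mstar/(2*math.pi*hbar**2)   # F/m^2, degenerate 2DEG quantum capacitance
Ctot = Cb*Cq/(Cb + Cq)                        # series combination

err = (Cb - Ctot)/Cb   # fractional overestimate of the charge if Cq is neglected
print(f"Cb = {Cb*1e2:.1f} uF/cm^2, Cq = {Cq*1e2:.1f} uF/cm^2, error = {err:.0%}")
```

Even though Cq exceeds Cb here, neglecting it overestimates the stored charge by roughly 20% for this thin high-κ barrier, consistent with the on-state error visible in the figure.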


The dependence of ns on V is seen explicitly in this transcendental equation. We can find two limits of the sheet density: a low-density limit when the Fermi level lies below the band edge, and a degenerate limit when µ − Ec ≫ k_bT, i.e., in the high conductivity ”on” state of the semiconductor. Inclusion of the quantum capacitance via the ns ≈ (CbCq/(Cb+Cq))·(V − VT)/q approximation matches the exact dependence of ns on V captured by Equation 17.17 in the ”on” state. However, neither of the approximations can capture the low conductivity ”off” state for V − VT ≲ 0, because when the Fermi level EFs enters the bandgap, the semiconductor behaves as an insulator rather than a metal14. If tb ≈ ∆, neglecting the quantum capacitance leads to large errors, whereas for tb ≫ ∆ the error is negligible. Since the DOS can change in different dimensions, the relative interplay of Cq and Cb can be significantly altered. For example, since in 1D the DOS g1d ∼ 1/√(E − Ec) has a van Hove singularity at the band edge and drops rapidly, Cq can dominate over Cb near the band edge. The resulting values of the dimension-dependent numerical factors γd, and the corresponding quantum (kinetic) inductances, are:

$$ d=1:\ \gamma_1 = 1, \qquad L_k^{1d} \approx L\cdot\frac{m_c^*}{q^2n_{1d}}, $$
$$ d=2:\ \gamma_2 = \frac{\pi^2}{8}, \qquad L_k^{2d} \approx \frac{\pi^2}{8}\cdot\frac{L}{W}\cdot\frac{m_c^*}{q^2n_{2d}}, $$
$$ d=3:\ \gamma_3 = \frac{4}{3}, \qquad L_k^{3d} \approx \frac{4}{3}\cdot\frac{L}{W^2}\cdot\frac{m_c^*}{q^2n_{3d}}. $$

The quantum inductance per unit ”length” for parabolic bandstructure conductors is therefore proportional to m_c*/q²n_d, up to a factor of order unity, for all dimensions in the ballistic limit18. Table 17.1 shows a few examples of the classical and quantum R, L, and C values. In this table, the respective conduction band-edge density of states for parabolic bandstructure is

$$ g_d(E) = \frac{g_sg_v}{(4\pi)^{d/2}\,\Gamma(d/2)}\left(\frac{2m_c^*}{\hbar^2}\right)^{d/2}(E-E_c)^{\frac{d-2}{2}}, $$

and for conical bandstructure is

$$ g_d^c(E) = \frac{g_sg_v}{2^{d-1}\pi^{d/2}\,\Gamma(d/2)}\cdot\frac{E^{d-1}}{(\hbar v_F)^d}. $$

The corresponding valence band-edge densities of states are written down by analogy. There are a few missing spots in the Table: you can follow Exercise 17.6 to fill in these spots.
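A quick order-of-magnitude evaluation of the 1D result above (a sketch; the carrier density and effective mass are illustrative assumptions matching the exercise values, not a specific device):

```python
q = 1.602e-19; me = 9.109e-31

# Lk/L ~ gamma_d * m*/(q^2 * n_d); in 1D, gamma_1 = 1.
mstar = 0.2*me             # assumed effective mass: 0.2 me
n1d = 5e8                  # assumed 1D density: 5x10^6 /cm = 5x10^8 /m
Lk_per_m = mstar/(q**2 * n1d)          # kinetic inductance per unit length, H/m
print(f"Lk ~ {Lk_per_m*1e-6*1e9:.0f} nH per um of wire")
```

The result, of order 10 nH per micron, is enormous compared with the ~1 pH/µm magnetic inductance of a classical wire, foreshadowing the comparison of Section 17.5.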

¹⁸ Identical formulae are obtained for both the quantum capacitance and the kinetic inductance if the problem is solved in the scattering dominated regime using the Boltzmann transport equation of Chapter 21.


17.5 Quantum R, L, C circuits

Let us consider a metallic 1D carbon nanotube to gauge the importance of quantum capacitance and kinetic inductance. Fig. 17.16 shows a schematic figure of a ”classical” conducting wire as a transmission line, connected to an AC voltage source, similar to the classical RLC circuit of Section 17.1 earlier in this chapter. The classical wave impedance is indicated as Z0 = √(µ0/ε0) ≈ 120π ≈ 377 Ω. Also shown for comparison is a ballistic metallic carbon nanotube, which has a classical capacitance and inductance per unit length of roughly C ≈ 50 aF/µm and L ≈ 1 pH/µm. The metallic carbon nanotube has a linear energy band dispersion E(k) = ħv_Fk with a spin degeneracy gs = 2 and valley degeneracy gv = 2, and a characteristic band velocity v_F ∼ 8 × 10⁷ cm/s. The linear dispersion implies the group velocity of each k-state is v(k) = (1/ħ)dE(k)/dk = v_F. Since the DOS is 2gsgv/(hv_F), the quantum capacitance per unit length is given by Cq = q²·DOS = 2gsgvq²/(hv_F), which evaluates to Cq ≈ 386 aF/µm. The kinetic inductance similarly is Lk = h/(2q²v_F), which evaluates to Lk ≈ 16 nH/µm. This implies the internal transmission line ”quantum” wave impedance at high frequencies is Zq = √(Lk/Cq) = 1/Gq = Rq = h/(gsgvq²) ≈ 6.45 kΩ. The complex impedances dominate as f ∼ THz, highlighting a mismatch between the wave impedance in the ballistic nanotube and that of electromagnetic waves in vacuum¹⁹. The quantum capacitance and kinetic inductance of the metallic nanotube are therefore not negligible. It is important to account for the quantum capacitance in electrostatics, and both the quantum capacitance and kinetic inductance at high frequencies. In Section 17.2, we argued that for long conductors that are many electron mean free paths long, L ≫ λmfp, one can neglect the role of the quantized conductance. Let us identify the situations when the quantum capacitance and kinetic inductance effects may be neglected. From the mode-picture (see Exercises 17.3 and 17.4), we can write the 1D per unit length quantities Cq ∼ MGq/⟨v⟩ and Lk ∼ 1/(MGq⟨v⟩). Since the 1D classical capacitance per unit length is Ccl ≈ 2πε0εr, and the total capacitance is Ctot = (Ccl⁻¹ + Cq⁻¹)⁻¹, we can neglect Cq if it is much larger than Ccl, which is achieved if

$$ \frac{C_q}{C_{cl}} = \frac{2q^2M}{h\langle v\rangle}\cdot\frac{1}{2\pi\epsilon_0\epsilon_r} \gg 1 \implies M \gg \frac{\pi\epsilon_r}{2}\cdot\frac{\langle v\rangle}{c}\cdot\underbrace{\frac{4\pi\epsilon_0\hbar c}{q^2}}_{\frac{1}{\alpha}\sim 137} \sim \epsilon_r. \tag{17.31} $$

¹⁹ If the nanotube were to be used as a nanoscale antenna to radiate the energy as electromagnetic waves into the surroundings, the impedance mismatch is a measure of how difficult it is for it to radiate (or, because of reciprocity, detect).
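The nanotube numbers quoted above follow directly from the fundamental constants (a quick sketch using v_F = 8 × 10⁷ cm/s as in the text):

```python
import math

q = 1.602176634e-19; h = 6.62607015e-34
gs = gv = 2                    # spin and valley degeneracies of the metallic CNT
vF = 8e5                       # band velocity, m/s

Cq = 2*gs*gv*q**2/(h*vF)       # quantum capacitance per unit length, F/m
Lk = h/(2*q**2*vF)             # kinetic inductance per unit length, H/m
Zq = math.sqrt(Lk/Cq)          # quantum wave impedance, Ohm

print(f"Cq = {Cq*1e-6/1e-18:.0f} aF/um")   # close to the 386 aF/um quoted in the text
print(f"Lk = {Lk*1e-6/1e-9:.1f} nH/um")    # ~16 nH/um
print(f"Zq = {Zq:.0f} Ohm")                # ~6.45 kOhm = h/(gs*gv*q^2)
```

The ~6.45 kΩ quantum impedance is roughly 17× larger than the 377 Ω vacuum wave impedance, which quantifies the mismatch discussed above.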

Fig. 17.16 Quantum limit of a RLC transmission line and its interface with the electromagnetic spectrum.

Here c is the speed of light, and α ≈ 1/137 is the fine structure constant. The above analysis provides a simple rule of thumb: if the number of modes in the conductor exceeds roughly the relative dielectric constant of the medium surrounding the conductor, then the quantum capacitance may be neglected for the 1D conductor. For εr ∼ 4, roughly 10s



of modes will make the classical capacitance dominate. The cases for 2D or 3D conductors should be evaluated separately, since the electrostatic capacitance depends on the geometry. Since the classical and kinetic inductances add, Lk should be much smaller than Lcl to be negligible. For the kinetic inductance to be negligible, a far higher number of modes is necessary. This is because the classical 1D capacitance and inductance per unit length of a transmission line are related via LclCcl = 1/c², whereas CqLk ≈ 1/⟨v⟩², implying

$$ \frac{L_{cl}C_{cl}}{L_kC_q} \sim \frac{\langle v\rangle^2}{c^2} \sim 10^{-4} \implies \frac{L_k}{L_{cl}} = \frac{c^2}{\langle v\rangle^2}\cdot\frac{C_{cl}}{C_q} > 1 \ \ \text{unless} \ \ \frac{C_q}{C_{cl}} > 10^4. $$


From the discussion of the quantum capacitance, this implies that a very large number of modes, exceeding thousands, is necessary. In other words, the kinetic inductance is not negligible for nanoscale conductors, and neglecting it would lead to significant errors in understanding and designing electronic devices that operate at high frequencies. The frequency range of interest can also be gauged by comparing the impedance due to kinetic inductance, ωLk, with the resistance R. Noting that R = Lk/τ, we can write

$$ \omega L_k = 2\pi f\cdot\frac{m^*}{q^2n}\cdot\frac{l}{A} > R = \frac{m^*}{q^2n\tau}\cdot\frac{l}{A} \implies f > \frac{1}{2\pi\tau}, $$


where τ ∼ 10⁻¹² s is a typical scattering time. This analysis indicates that the kinetic inductance is important at high frequencies, beyond roughly a hundred GHz (f > 1/2πτ ≈ 160 GHz for τ = 1 ps).
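The crossover frequency f = 1/(2πτ) evaluates to (a one-line sketch with the τ = 1 ps value above):

```python
import math

tau = 1e-12                    # typical scattering time, s
f_c = 1/(2*math.pi*tau)        # frequency above which omega*Lk exceeds R
print(f"f_c = {f_c/1e9:.0f} GHz")   # prints ~159 GHz
```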

17.6 Negative R, L, and C

We have discussed the quantum limits of R, L, and C in the last three sections. An interesting question is: can R, L, and C be negative? If the conductor is not generating energy, the answer is no²⁰. If any of these three quantities were negative, then from dU/dt = I²R, U = ½LI², and U = ½CV², one could generate energy out of nowhere, violating energy conservation. On the other hand, the time-dependent, or complex, impedances of these three elements in response to a driving current Iωe^{iωt} or voltage Vωe^{iωt} are ZR = R, ZL = iωL, and ZC = 1/(iωC), where ω is the frequency. The complex portions of the impedances are not negative but imaginary, with a transparent physical meaning: the voltage drop across an inductor or a capacitor is out of phase with the current by +π/2 or −π/2 respectively, whereas it is in phase with the current across a resistor. Though devices do not have negative DC resistance, inductance, and capacitance, they can have negative differential DC values of these three parameters. The negative differential values are not of purely academic interest, but have immense practical applications. For example, though a standard (positive) resistor cannot amplify electrical signals, if a device exhibits

²⁰ For example, a solar cell under illumination generates energy, and the I-V curve goes into the fourth quadrant, making I·V negative (see Chapter 28). A simple R = V/I will indeed give a negative resistance, indicating that energy is being generated. Similarly, a thermoelectric device with a temperature gradient across it can generate energy from heat, as can a capacitor with metal plates that undergo radioactive decay, which harvests nuclear energy. We do not consider these situations here.


Fig. 17.17 Current-voltage oscillations observed by J. B. Gunn in n-type GaAs (from Sol. St. Comm., 1, 88, 1963).

NDR, it can be used in a feedback circuit to amplify or boost electrical signals. Negative Differential Resistance (NDR): A strange property of the semiconductor GaAs was uncovered in 1962 by J. B. Gunn. He observed that when a lightly n-type doped semiconductor GaAs was contacted by heavily doped regions, and a DC current was passed through it, at certain DC currents the voltage drop across the device started oscillating with time, as seen in Fig. 17.17. Now a DC current I = qnvA, where n is the 3D electron density and v = µF is the drift velocity in response to an electric field F with electron mobility µ, ordinarily should lead to a single value of a DC voltage across a standard resistor. The reason for the oscillatory output was soon tracked down by several researchers to the existence of a negative differential resistance, or NDR, originating in the electron bandstructure of GaAs. Fig. 17.18 shows the full bandstructure of GaAs calculated by the empirical pseudopotential method of Chapter 13. An interesting feature of the GaAs bandstructure is that the L-valley minimum is very close in energy to the Γ-valley minimum: the separation of the two valley minima is EL − EΓ = EL−Γ = 0.31 eV, much smaller than the bandgap Eg = 1.42 eV. Also note that the L-valley minimum has a much smaller curvature than the Γ-valley minimum: indeed, the effective mass of the Γ-valley minimum is m*Γ = 0.067me, whereas that of the L-valley is m*L = 0.55me, i.e., roughly 8× heavier than the Γ-valley. Finally, since the L-valley occurs at the BZ edge in the ⟨111⟩ direction, there are 8 such equivalent valleys, of which only half are in the 1st BZ, making the L-valley degeneracy gvL = 4, compared to gvΓ = 1. The electrons at the bottom of the Γ-valley, placed there by light donor doping, are accelerated by an electric field via the standard rate equation ħ(dk/dt) = qF − ħ(k − k₀)/τ_k.
Since m*Γ is low, the change in E(k) is rapid, and with a relatively small electric field of ∼ 3 kV/cm, a significant population of electrons ends up in the L-valley, as shown schematically in Fig. 17.18. Because the effective mass in the L-valley is much larger, the electrons there have a much smaller drift velocity. The two carrier populations result in a decrease in the net velocity with an increase in the electric field, causing the negative differential resistance. A simple model can capture this physics; the bottom panel of Fig. 17.18 shows a calculated velocity-field curve for GaAs. Assuming a lattice temperature of TL, an electric field F causes the electrons to go out of equilibrium with the lattice, to an effective electron temperature Te. This non-equilibrium condition is captured by the name hot electrons. The electron concentration in the Γ-valley in the non-degenerate situation is nΓ = N³d_cΓ · exp[−(EΓ − EF)/kbTe], and in the L-valley is nL = N³d_cL · exp[−(EL − EF)/kbTe]. The ratio of electron densities in the L-valley to the Γ-valley is

$$ \frac{n_L}{n_\Gamma} = \frac{N_{cL}^{3d}\, e^{-\frac{E_L-E_F}{k_bT_e}}}{N_{c\Gamma}^{3d}\, e^{-\frac{E_\Gamma-E_F}{k_bT_e}}} = \frac{g_{vL}}{g_{v\Gamma}}\cdot\left(\frac{m_L^*}{m_\Gamma^*}\right)^{3/2}\cdot e^{-\frac{E_L-E_\Gamma}{k_bT_e}}. $$




By energy conservation, the rate of energy gain by electrons from the field must equal the rate at which the electrons lose their kinetic energy. The electrons lose their kinetic energy by colliding with the atoms of the crystal, setting off lattice vibrations, or phonons. This physics will be discussed in Chapter 22. Let the energy relaxation time be τE, which is typically of the order of 10⁻¹² s, or 1 ps. The total kinetic energy of electrons moving in 3D is E = (3/2)kbTe for the hot electrons, and E0 = (3/2)kbTL when they are in equilibrium at F = 0. Then, the energy balance equation dictates that at steady state, we must have

$$ \frac{dE}{dt} = qFv - \frac{E-E_0}{\tau_E} \underbrace{\implies}_{\text{steady state}} qFv = \frac{3}{2}\cdot\frac{k_b(T_e-T_L)}{\tau_E} \implies T_e = T_L + \frac{2}{3}\cdot\frac{q\tau_E}{k_b}\cdot vF. $$


The net current density due to electrons in both valleys is

$$ J = q(n_\Gamma v_\Gamma + n_Lv_L) = q(n_\Gamma + n_L)v \implies v = \frac{n_\Gamma v_\Gamma + n_Lv_L}{n_\Gamma + n_L} \underbrace{\approx}_{v_L\approx 0} \frac{\mu_\Gamma F}{1 + \frac{n_L}{n_\Gamma}} = \frac{\mu_\Gamma F}{1 + \frac{g_{vL}}{g_{v\Gamma}}\cdot\left(\frac{m_L^*}{m_\Gamma^*}\right)^{3/2}\cdot e^{-\frac{E_L-E_\Gamma}{k_bT_e}}}, \tag{17.36} $$

where the ensemble velocity of electrons is expressed as a function of the bandstructure parameters, the driving electric field F, and the resulting electron temperature Te. We have made the reasonable approximation that the velocity of electrons in the L-valley is far smaller than in the Γ-valley, and neglected their contribution to the current, but retained their contribution to the net electron concentration. The electron temperature is related to the electric field F, the lattice temperature TL, and the bandstructure via

$$ T_e = T_L + \frac{2}{3}\cdot\frac{q\tau_E\mu_\Gamma F^2}{k_b\left[1 + \frac{g_{vL}}{g_{v\Gamma}}\cdot\left(\frac{m_L^*}{m_\Gamma^*}\right)^{3/2}\cdot e^{-\frac{E_L-E_\Gamma}{k_bT_e}}\right]}. $$




This equation is numerically solved to obtain the self-consistent electron temperature Te for a given electric field F and lattice temperature TL. Then, the velocity-field curve is obtained for a given TL by varying the electric field using Equation 17.36, using an experimental electron mobility µΓ ∼ 9400 · (300/TL) cm²/V·s, where TL is the lattice temperature in Kelvin. The bottom panel of Fig. 17.18 shows the resulting electron velocity as a function of the electric field, and the fraction nL/(nΓ + nL) of the electrons that are in the L-valley. For fields less than ∼ 2 kV/cm, the velocity increases sharply, but because of the transfer of electrons to the high effective mass L-valley, the net velocity decreases for electric fields higher than F ∼ 3 kV/cm. As a result, the current decreases with increasing voltage: this is the NDR. Cooling the device enhances the NDR. Note that there is no negative resistance, only negative differential resistance: V/I is always positive unless the device is generating energy.
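The two-valley model above can be solved self-consistently in a few lines (a sketch; the parameter values follow the text, but the bisection solver and the chosen field points are implementation details of my own):

```python
import math

q = 1.602e-19; kb = 1.381e-23

# GaAs two-valley parameters from the text
E_LG = 0.31*q          # E_L - E_Gamma (J)
m_ratio = 0.55/0.067   # m*_L / m*_Gamma
g_ratio = 4.0          # g_vL / g_vGamma
tauE = 1e-12           # energy relaxation time (s)
TL = 300.0             # lattice temperature (K)
muG = 0.94             # Gamma-valley mobility at 300 K, m^2/(V s) (= 9400 cm^2/V s)

def ratio_nL_nG(Te):
    """n_L/n_Gamma from the non-degenerate (Boltzmann) ratio at temperature Te."""
    return g_ratio * m_ratio**1.5 * math.exp(-E_LG/(kb*Te))

def velocity(F):
    """Drift velocity at field F (V/m): solve Te = TL + (2/3)(q tauE/kb) v F by bisection."""
    A = (2.0/3.0)*q*tauE/kb
    lo, hi = TL, TL + A*muG*F**2 + 1.0     # the root is bracketed since v <= muG*F
    for _ in range(100):
        Te = 0.5*(lo + hi)
        v = muG*F/(1.0 + ratio_nL_nG(Te))
        if Te - TL - A*v*F > 0: hi = Te
        else: lo = Te
    return muG*F/(1.0 + ratio_nL_nG(0.5*(lo + hi)))

for F_kV in [1, 2, 3, 5, 10]:
    print(f"F = {F_kV:3d} kV/cm -> v = {velocity(F_kV*1e5):.2e} m/s")
# The velocity peaks near ~3 kV/cm and then falls with increasing field:
# this decreasing v(F) branch is the negative differential resistance.
```

Because Te increases monotonically with v, the balance equation has a unique root and bisection converges unconditionally, reproducing the qualitative shape of the velocity-field curve in Fig. 17.18.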

Fig. 17.18 Physics of the negative differential resistance (NDR) observed in the Gunn effect, and its relation to the bandstructure of GaAs. The transfer of electrons from a high velocity valley to a low velocity valley (or the transferred electron effect) is responsible for the NDR. Note that the NDR is reached at a relatively low electric field of ∼ 3 kV/cm, which makes Gunn oscillators useful for compact and effective sources of microwave power.


²¹ Because the Gunn oscillator is a compact source of microwave power at high frequencies (∼ 1–100 GHz), and it produces the microwave power at a low voltage input with respectable conversion efficiency, it is used in radar detectors, collision avoidance radar, movement sensors, and several related applications.

Fig. 17.19 Negative Differential Capacitance.

²² Because of the hysteresis in their properties, both ferromagnets and ferroelectrics are used for memory elements.

If a fixed DC current is driven in the NDR regime, there are two stable voltage values. Local fluctuations in the carrier concentration set up oscillations between the allowed voltage points. Thus, the simple device is able to convert a portion of the DC power I·V supplied to it into AC power ∆I·∆V, where ∆I and ∆V are the margins of oscillations. The frequency of these oscillations is limited by the Γ → L intervalley scattering rate, to be discussed in Chapter 23. Typical frequencies achieved are in the 1–100 GHz range, making the Gunn oscillator an effective microwave power source²¹. The two-terminal device is connected to metal antennas of appropriate size to radiate and receive microwave signals. Gunn oscillations are typically observed in bulk semiconductors that have bandstructure traits similar to GaAs (such as InP), and are a beautiful example of the immense potential hidden in the details of the E(k) diagram of crystals, stemming from the quantum mechanics of electrons in periodic potentials. NDR in the current-voltage characteristics is also obtained in heavily doped pn junction diodes (due to Esaki tunneling), and in heterostructure resonant tunneling diodes, both to be covered in Chapter 24. But the simplicity of construction and operation of the Gunn oscillator, which needs just a bulk semiconductor, is the reason for its enduring popularity. Negative Differential Capacitance and Inductance: Unlike the resistor, which dissipates energy, a capacitor C and an inductor L can store energy. When electron spins interact strongly with each other and switch collectively, the solid can become ferromagnetic. Ferromagnetic materials are used in inductor cores to boost their energy storage capacity. Similarly, when electric dipoles interact strongly with each other and switch collectively, the material becomes ferroelectric.
Both ferromagnetic and ferroelectric materials are characterized by a response function that exhibits hysteresis. For a ferromagnet, the net magnetization is hysteretic with the external magnetic field, whereas for a ferroelectric material the net electrical polarization, or charge Q, is hysteretic with the external voltage V, or the electric field. Fig. 17.19 indicates a ferroelectric energy vs. charge curve for a parallel plate ferroelectric capacitor, and the corresponding hysteretic loop of charge in response to the voltage, compared to a standard (non-ferroelectric) dielectric. Because the ferroelectric has two lowest energy states, poled ”up” and ”down” (similar to N and S of a ferromagnet), the energy U vs. Q curve has a hump between its two minima, with a negative curvature region around the origin. This is somewhat similar to the effective mass in electronic bandstructure: therefore, it is believed that a ferroelectric capacitor can potentially provide a DC negative differential capacitance. Similarly, a DC negative differential inductance can also be expected in an inductor. These properties are currently under investigation. Since some semiconductors can be ferroelectric, and some can also be ferromagnetic, negative differential capacitance and inductance effects in DC could prove useful for new generations of device applications, to add to negative differential resistance²².


The concepts of quantized conductance, capacitance, and inductance are important in the field of circuit quantum electrodynamics, which provides the conceptual framework for the design of quantum bits or qubits, the workhorse for quantum computation. In such circuits, electromagnetic energy is manipulated in the form of a single photon.

17.7 Chapter summary section In this chapter, we learned:

• How to view conductance, capacitance, and inductance of conductors from the point of view of their bandstructure and electronic modes. • The mechanism of ballistic transport, and how to move smoothly between the ballistic regime to the scattering dominated regimes using the powerful concept of quasi-Fermi levels. • The concepts of quantum capacitance and kinetic inductance, which assume increasing importance in nanoscale conductors, and at high frequencies. • How negative differential resistance can be achieved in bulk semiconductors exploiting quirks of the electron bandstructure.

Further reading Quantum Transport: Atom to Transistor, and Lecture Notes in Nanoelectronics by S. Datta provide lucid introductions to the concepts, and many facets of quantized conductance and quantum capacitance, and are highly recommended as further reading. Several concepts discussed in this chapter are topics of current research and

I expect them to undergo refinements with new experiments. Aspects of kinetic inductance and quantum capacitance that are currently unknown are discussed in the review article Nanoelectromagnetics: Circuit and Electromagnetic Properties of Carbon Nanotubes by Rutherglen and Burke, in Small (2009), volume 5, page 884.

Exercises (17.1) Quantum current flow and saturation in semiconductors We discussed the carrier density in semiconductor bands and the ballistic current flowing in them

for transport in d = 1, 2, 3 dimensions in Chapter 5. (a) Consider the 1D case, encountered in semiconductor quantum wires and carbon nanotubes. Use

a conduction band effective mass of m*c = 0.2me, valley degeneracy gv = 1 and spin degeneracy gs = 2. Calculate and plot the source quasi-Fermi level at the source injection point (EFs − Ec), and the drain quasi-Fermi level (EFs − qV − Ec) as a function of the voltage 0 < V < 2 Volt for a high electron density n1d = 5 × 10⁶ /cm at room temperature T = 300 K, and explain the plot. Next, plot the ballistic currents vs. voltage for electron densities n1d = 1, 2, 3, 4, 5 × 10⁶ /cm, both at 300 K and at 77 K for 0 < V < 2 Volt. You should observe that in 1D semiconductors the ballistic current does not depend on the 1D density at low voltages and low temperatures – why is this so? Why does the current saturate at high voltages? (b) Now consider the 2D case, which is encountered in silicon transistors, III-V quantum well high-electron mobility transistors, and 2D crystal semiconductors and metals. For this problem, use a conduction band effective mass of m*c = 0.2me, valley degeneracy gv = 1 and spin degeneracy gs = 2. Calculate and plot the source quasi-Fermi level at the source injection point (EFs − Ec), and the drain quasi-Fermi level (EFs − qV − Ec) as a function of the voltage 0 < V < 2 Volt for a high electron density n2d = 5 × 10¹³ /cm² at room temperature T = 300 K, and explain the plot. Next, plot the ballistic current per unit width vs. voltage for electron densities n2d = 1, 2, 3, 4, 5 × 10¹³ /cm², both at 300 K and at 77 K for 0 < V < 2 Volt. You should observe that unlike in 1D semiconductors, the ballistic current per unit width in 2D does depend on the 2D density at low voltages and low temperatures – why is this so? Why does the current saturate at high voltages? In this problem, you have solved the 1D and 2D ballistic transistor problem in disguise. The variation of the carrier density in a transistor is done with a third terminal called the gate.
We will use the same method to understand the behavior of the ballistic field-effect transistor (FET) in Chapter 19.

(17.2) A 2DEG as a parallel array of 1D conductors Electrons of sheet carrier density ns sit in the conduction band of a 2D electron system of energy bandstructure E(kx, ky) = ħ²(kx² + ky²)/2m*c, with the k-space occupation of carriers shown in Chapter 5, Fig. 5.14. Assume a spin degeneracy of gs = 2 and a valley degeneracy of gv = 1. The width of the 2D system is W, the length L, and ohmic source and drain contacts are made to connect to the electrons to flow a current in the x-direction. Solve this problem entirely at T = 0 K. The allowed discrete points in the k-space, (kx, ky) = (2πnx/L, 2πny/W), where (nx, ny) are integers, are considered individual modes of the 2DEG as indicated in Fig. ??. The collection of modes with the same ny is considered a 1D mode of the 2DEG. (a) When the applied voltage across the source/drain contacts is Vds = 0, find the Fermi wavevector k0 as shown in the left of Fig. 5.14. (b) Show that the number of 1D modes with current flow in the x-direction because of the finite width of the 2D conductor is M0 = k0W/π. Use part (a) to write this in terms of the 2DEG density. (c) Now a voltage Vds is applied across the drain and the source such that the net sheet carrier density of the 2DEG does not change. Assume ballistic transport and show that, as in Fig. 5.14, kR = √(k0² + (m*c/ħ²)qVds) and kL = √(k0² − (m*c/ħ²)qVds). (d) Show that the voltage Vds reduces the total number of left-going modes ML and increases the total number of right-going modes MR. Find expressions for ML and MR. (e) Find the voltage Vds at which carriers in all modes move to the right and no carriers move to the left. (f) Find how many right-going 1D modes are present in the above situation when all carriers move to the right. (g) Because each 1D mode in the ballistic limit can provide the maximum conductance of a quantum of conductance G = gsgvq²/h, find the ”saturation” current Id when the critical Vds of part (e) is reached.

(17.3) Quantum capacitance as a function of modes Show that the quantum capacitance derived in 17.11 may be equivalently reformulated as

$$ C_q = \frac{q^2}{h}\left\langle \frac{2M}{v} \right\rangle, \tag{17.38} $$

where M is the mode number, v is the group velocity of state k, and the average ⟨f⟩ = (∑_k f)/(∑_k 1) is taken over k-states.

(17.4) Kinetic inductance as a function of modes
Show that the kinetic inductance derived in 17.11 may be equivalently reformulated as

$$L_k = \frac{h}{q^2}\left\langle \frac{1}{2Mv}\right\rangle, \qquad (17.39)$$

where $M$ is the mode number, $v$ is the group velocity of state $k$, and the average $\langle f \rangle = \frac{\sum_k f}{\sum_k 1}$ is a sum over k-states.

(17.5) Kinetic inductance of graphene
Show that the kinetic inductance of 2D graphene of bandstructure $E(k_x, k_y) = \hbar v_F \sqrt{k_x^2 + k_y^2}$ at $T \to 0$ K is

$$L_k = \frac{h}{2q^2}\cdot\frac{1}{v_F k_F W}, \qquad (17.40)$$

where $h$ is Planck's constant, $k_F$ is the Fermi wavevector, defining the Fermi level $E_F = \hbar v_F k_F$, and $W$ is the width.
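Equation 17.40 can be checked numerically. The sketch below assumes a Fermi velocity $v_F = 10^6$ m/s and a sheet density $n_s = 10^{13}$/cm$^2$ (so that $k_F = \sqrt{\pi n_s}$ for graphene's $g_s = g_v = 2$); these are illustrative values, not part of the exercise. It also verifies that $L_k \cdot W$ equals the Drude-like per-square form $\pi\hbar^2/(q^2 E_F)$, which follows algebraically from Equation 17.40.

```python
import numpy as np

h = 6.62607015e-34     # Planck constant, J*s
hbar = h / (2 * np.pi)
q = 1.602176634e-19    # C

# Assumed illustrative values: graphene with gs = gv = 2, so ns = kF^2/pi
vF = 1e6               # Fermi velocity, m/s
ns = 1e13 * 1e4        # 1e13 /cm^2 -> /m^2
W = 1e-6               # width, 1 um

kF = np.sqrt(np.pi * ns)   # Fermi wavevector
EF = hbar * vF * kF        # Fermi level, J

# Equation 17.40: kinetic inductance per unit length of the sheet
Lk = h / (2 * q**2 * vF * kF * W)   # H/m

# Algebraic cross-check: Lk*W = pi*hbar^2/(q^2*EF), the inductance of a
# W x W square, since h = 2*pi*hbar and EF = hbar*vF*kF
Lk_square = Lk * W
print(f"kF = {kF:.3e} /m, EF = {EF/q:.2f} eV")
print(f"Lk = {Lk:.3e} H/m  ({Lk_square*1e12:.1f} pH per square)")
```

For these numbers the kinetic inductance is tens of pH per square, far larger than the magnetic inductance of a comparable metal trace, which is why graphene is interesting for plasmonic and slow-wave structures.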

(17.6) Filling in the missing quantized R, L, and C
Fill in the missing terms and complete Table 17.1 by analogy to the examples provided throughout this chapter. Also compare and contrast how the $R_q$, $L_k$, and $C_q$ change with dimensions and bandstructures.

Junction Magic: Schottky, pn and Bipolar Transistors

At the heart of every semiconductor device is a Schottky diode or a pn junction diode, in one form or another. A clear understanding of the physics of their operation is essential for the analysis and design of modern electronic and photonic devices. The majestic range of applications of the Schottky and pn junction diodes arises from the physics of electron transport in them. Unlike the resistors, capacitors, and inductors of Chapter 17, controlling the transport of electrons or holes in diodes enables us to engineer highly non-linear (and even negative differential) resistors, and transistors leading to electronic gain. In Chapter 15, we discussed the equilibrium energy band diagrams of these diodes. In this chapter, we will understand:

• Ballistic electron transport in Schottky diodes and the resulting current-voltage characteristics,
• How the capacitive behavior of Schottky diodes is at the root of field-effect transistors,
• Electron and hole transport in pn junction diodes, and
• How the transport physics of the pn diode is responsible for electronic gain, bipolar transistors, solar cells, LEDs and lasers.

18.1 Ballistic Schottky diodes

Historically, the Schottky diode is the first semiconductor device in which unidirectional, or non-reciprocal, current flow was observed. Metal-semiconductor rectifiers were investigated as early as the mid-1880s by Braun (Fig. 18.1), and used by Bose (Fig. 18.2) to demonstrate their usage in short wavelength (high frequency) radio receivers, followed by Marconi's entrepreneurship that gave birth to radio communications. All this happened well before the physics of metals and semiconductors was understood through band theory! But the understanding of the physics of the device after the revelations of the quantum mechanical theory of metals and semiconductors, as we have discussed in Modules I and II of this book, propelled them into a class of very useful electronic devices via several breakthroughs. Initially the unreliability of the cat's whiskers was removed, soon leading to power rectifiers, transistors, and much more.

18.1 Ballistic Schottky diodes
18.2 pn diodes: discovery
18.3 pn diodes: transport
18.4 Bipolar junction transistors
18.5 Deathniums!
18.6 Chapter summary section
Further reading




Fig. 18.1 Ferdinand Braun in 1875 used metal-semiconductor point contact rectifiers. For his development of phased array antennas, he was awarded the 1909 Nobel Prize in Physics, with Marconi. Note that the electron was yet to be discovered, and semiconductors with the associated quantum mechanics and solid state physics were half a century away.

Fig. 18.2 Jagadish Bose demonstrated the use of metal-semiconductor contacts for the earliest mm-wave radio receivers in the 1890s.


Fig. 18.3 Biased Schottky diode: charge, electric field, and energy band diagrams.

In Chapter 15, Fig. 15.11 we have discussed the equilibrium energy band diagram and electrostatics of the Schottky diode. The major conclusions were: due to the built-in voltage from the difference in the work functions $qV_{bi} = E_{Fm} - E_{Fs}$ at equilibrium, charge transfer occurs between the metal and the semiconductor. For a rectifying contact, the n-type semiconductor donor-doped at a density $N_D$ loses mobile carriers to the metal, forming a depletion region of thickness $x_d = \sqrt{2\epsilon_s V_{bi}/qN_D}$ at the junction. There is a potential barrier formed for electrons to move from the conduction band of the semiconductor into the metal due to the depletion region. That is our point of departure for discussing how the application of a voltage bias leads to a rectifying current. Let us for a moment neglect the flow of current, and focus on how the depletion region and the barrier height change upon applying a voltage bias across the Schottky diode. This is shown in Fig. 18.3. A battery of voltage $V$ is now connected across the contacts. For clarity of the following discussion, we will assume that the metal is grounded, and the battery is connected to the semiconductor through an ohmic contact far from the junction, as indicated in the energy band diagram of Fig. 18.3. This means the Fermi level of the semiconductor $E_{Fs}$ is controlled by the voltage of the battery. The equilibrium case is when $V = 0$, and $E_{Fm} = E_{Fs}$.

Reverse Bias: Application of a -ve voltage $V < 0$ on the terminal (called reverse bias) connected to the metal is equivalent to applying a +ve voltage at the terminal connected to the semiconductor, if the metal were grounded. This must charge the semiconductor more positive than it is at equilibrium. The depletion region thickness therefore increases to

$$x_d = \sqrt{\frac{2\epsilon_s (V_{bi} - V)}{q N_D}}, \qquad (18.1)$$

and the potential barrier for electrons to go from the semiconductor to the metal has increased from $qV_{bi} \to q(V_{bi} - V)$, since $V$ is negative. The strength of the electric field in response to the voltage also increases. Note that since the slope of the electric field by Gauss's law is $\frac{dF(x)}{dx} = -\frac{qN_D^+}{\epsilon_s}$, it does not change in shape inside the depletion region if the doping density is uniform. Thus the electric field profile $F(x)$ in Fig. 18.3 changes as similar triangles. The change in the area under the electric field curve $\int F(x)dx$ is simply the applied voltage $V$, leading to a surface field

$$F_{max} = \frac{qN_D x_d}{\epsilon_s} = \sqrt{\frac{2qN_D(V_{bi} - V)}{\epsilon_s}}. \qquad (18.2)$$

Forward Bias: Now if a +ve voltage $V > 0$ is applied on the battery, $E_{Fs}$ must increase, and the semiconductor should get charged more negatively than in equilibrium. The battery achieves this by pulling some electrons from the metal through the external circuit and pumping them into the semiconductor through the ohmic contact. The expressions for the depletion region thickness, the depletion capacitance, and the maximum surface electric field $F_{max}$ in Equations 18.1 and 18.2 remain unchanged for $V > 0$. This means that the depletion region shrinks, and the surface electric field decreases at $V > 0$ compared to equilibrium. Of high importance is the lowering of the potential barrier for electrons at the conduction band edge of the semiconductor near the depletion edge to $q(V_{bi} - V)$, which will now lead to an exponential increase in the current. Before significant current flows however, the above discussions show that the Schottky diode is a variable capacitor (or varactor) in its low-conductivity state. Fig. 18.5 shows the calculated depletion thickness, depletion capacitance, and the surface electric field for three different doping densities. The depletion thickness is smallest for the heaviest doping. The depletion capacitance decreases at negative (reverse bias) voltages as $1/\sqrt{V_{bi} - V}$, and the corresponding surface electric field increases as the reverse bias voltage increases.
For example, for a heavy doping of $N_D = 10^{18}$/cm$^3$, the depletion thickness is $x_{depl} \sim 40$ nm, the zero bias depletion capacitance is $C_{depl} = 0.27$ µF/cm$^2$, and the surface electric field is $F_{max} = 0.6$ MV/cm. We will see shortly that a high surface electric field under reverse bias promotes the quantum mechanical tunneling of electrons through the bandgap. But first let us look at current flow in the absence of tunneling. Fig. 18.6 (a) shows a detailed energy-band diagram of the Schottky diode when a forward bias voltage $V$ is applied. The quasi Fermi level of the bulk n-type semiconductor is raised by $qV$ with respect to the metal Fermi level: $E_{Fs} - E_{Fm} = qV$. Note that the top of the barrier is shown as rounded compared to the sharp kind shown in gray, and discussed earlier. This is an important real effect, which we

































Fig. 18.5 Schottky diode depletion thickness $x_d$, depletion capacitance $C_{depl}$, and surface electric field $F_{max}$ as a function of voltage $V$ for $N_D = 10^{16}$, $10^{17}$, and $10^{18}$/cm$^3$ for a semiconductor of dielectric constant $\epsilon_s = 10\epsilon_0$ and a M-S built-in voltage $V_{bi} = 1$ V.


Fig. 18.6 (a) Energy band diagram of a Schottky diode showing the image-force lowering of the barrier height, the electron distribution in k-space at the top of the barrier at forward bias, and the circuit diagram. (b) Linear scale current-voltage characteristics of Schottky diodes of two barrier heights $q\phi_b = 0.6$ eV and $q\phi_b = 1.0$ eV at 300 K and 100 K. (c) Log scale current-voltage characteristics of a Schottky diode of barrier height $q\phi_b = 1.0$ eV at three different temperatures. The calculations are for an isotropic, single-valley ($g_v = 1$) conduction band minimum with an electron effective mass $m_c^{\star} = 0.2m_e$.

now discuss to prepare for calculating the current. The band edge variation from the junction $z = 0$ without image force lowering (Fig. 18.7) varies quadratically, following the Poisson equation:






$$E_c(z) = q(\phi_b - V_{bi} - V) + \frac{q^2 N_D (z - z_d)^2}{2\epsilon_s}, \quad \text{where} \quad V_{bi} = \frac{q N_D z_d^2}{2\epsilon_s}, \qquad (18.3)$$

where $z_d$ is the depletion thickness. Consider an electron a distance $z$ from the junction, in the depletion region of the semiconductor, approaching the top of the barrier. The free electrons in the metal rearrange to create an image charge of the electron of positive sign located at a point $-z$ inside the metal. As a result, the electron experiences an attraction $V_{img}(z) = -q^2/[4\pi\epsilon_s(2z)]$. Adding this image potential to $E_c(z)$ shifts the barrier maximum from the junction to a point slightly away from the junction inside the semiconductor at $z_m$. The top of the barrier is lowered by a finite amount, and the new barrier height is what we re-label as $q\phi_b$, as also shown in Fig. 18.6 (a)¹. We now evaluate the current in the full ballistic approximation by neglecting tunneling. The quantum mechanical current density is given, as always, by $J = \frac{q g_s g_v}{L^3}\sum_{\mathbf{k}} v_g(\mathbf{k}) \cdot f(\mathbf{k}) \cdot T(\mathbf{k})$, where $g_s, g_v$ are the spin and valley degeneracies, $v_g(\mathbf{k})$ is the group velocity of state $\mathbf{k}$, $f(\mathbf{k})$ its occupation function, and $T(\mathbf{k})$ is the transmission probability. A very useful trick to calculate currents is to realize the obvious: due to current continuity, the current density $J$ is the same at all planes $z$. Consider the plane $z = z_m$ in Fig. 18.6 (a) where the potential barrier reaches its maximum after image force lowering. In this plane, the electrons in the conduction band moving to the left towards the metal must have come from the bulk of the semiconductor.
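The image-force lowering can be located numerically by adding the image potential, written as above, to the parabolic band bending of Equation 18.3. The sketch below uses illustrative values ($N_D = 10^{17}$/cm$^3$, $V_{bi} = 1$ V, $\epsilon_s = 10\epsilon_0$) that are assumptions, not taken from this section's figures; note also that the numerical prefactor of the image term differs between references, so the lowering obtained here is indicative only.

```python
import numpy as np

q = 1.602176634e-19
eps0 = 8.8541878128e-12
eps_s = 10 * eps0

# Illustrative values (assumptions)
ND = 1e17 * 1e6   # /m^3
Vbi = 1.0         # V, total band bending at zero bias

zd = np.sqrt(2 * eps_s * Vbi / (q * ND))   # depletion thickness

# Barrier profile in eV, measured from the bulk conduction band edge:
# parabolic term of Eq. 18.3 plus the image attraction -q^2/[4*pi*eps_s*(2z)]
z = np.linspace(0.2e-9, zd, 200_000)
Ec = Vbi * (1 - z / zd)**2                  # parabolic band bending, eV
Vimg = -q / (4 * np.pi * eps_s * (2 * z))   # image term in eV (energy/q)
U = Ec + Vimg

im = np.argmax(U)            # barrier maximum after image-force lowering
zm = z[im]
lowering = Vbi - U[im]       # barrier lowering, eV

print(f"zd = {zd*1e9:.0f} nm, zm = {zm*1e9:.2f} nm (zm/zd = {zm/zd:.3f})")
print(f"image-force barrier lowering = {lowering*1e3:.0f} meV")
```

For these numbers $z_m$ is a couple of nanometers while $z_d$ is over 100 nm, consistent with the footnote that the top of the barrier sits very close to the junction ($z_m/z_d \ll 1$), and the lowering is a few tens of meV.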








Fig. 18.7 Image force lowering of the Schottky barrier height.

¹ The top of the barrier plane is very close to the junction ($z_m/z_d \ll 1$).

If $n \cdot p > n_i^2$, there are more carriers than at thermal equilibrium, and recombination processes try to reduce their number towards $n \cdot p \to n_i^2$ by removing them from the bands. If $n \cdot p < n_i^2$, then generation processes create more interband carriers in an attempt to bring the system back to equilibrium. Consider the situation⁹ when $e^{qV/k_bT} \gg 1$ and therefore $n \cdot p \gg n_i^2$, leading to recombination. Under the reverse bias condition $qV/k_bT \ll -1$, $n \cdot p \ll n_i^2$, leading to generation. Instead of assuming $W_p \gg L_n$ and $W_n \gg L_p$ (see Fig. 18.15), let us consider the general case. The excess minority carrier concentration at a metal/semiconductor ohmic contact is zero¹⁰. This property modifies the boundary conditions and the diode equation to:

$$\Delta n(x = 0) = n_{p0}(e^{\frac{qV}{k_bT}} - 1) = A + B$$

$$\Delta n(x = W_p) = 0 = A \cdot e^{+\frac{W_p}{L_n}} + B \cdot e^{-\frac{W_p}{L_n}}$$

$$\implies n(x) - n_{p0} = n_{p0}(e^{\frac{qV}{k_bT}} - 1) \cdot \frac{\sinh(\frac{W_p - x}{L_n})}{\sinh(\frac{W_p}{L_n})}$$

$$\implies J_n^{diff}(x) = qD_n\frac{dn(x)}{dx} = -q n_{p0}\frac{D_n}{L_n}\cdot\frac{\cosh(\frac{W_p - x}{L_n})}{\sinh(\frac{W_p}{L_n})}\cdot(e^{\frac{qV}{k_bT}} - 1)$$

$$\implies |J_{tot}| = q\left(n_{p0}\frac{D_n}{L_n \tanh(\frac{W_p}{L_n})} + p_{n0}\frac{D_p}{L_p \tanh(\frac{W_n}{L_p})}\right)\cdot(e^{\frac{qV}{k_bT}} - 1). \qquad (18.15)$$

This general expression for the diffusion current looks identical to Equation 18.11, except for the $\tanh(\ldots)$ terms in the denominators. For long $W_n$ and $W_p$, $W_n/L_p \gg 1$ and $W_p/L_n \gg 1$, and since $\tanh x \to 1$ for $x \gg 1$, we recover Equation 18.11. In the opposite limit, if the undepleted regions are much smaller than the respective minority carrier diffusion lengths, $W_n/L_p \ll 1$ and $W_p/L_n \ll 1$; since $\tanh x \to x$ for $x \ll 1$, the diffusion lengths $L_n$ and $L_p$ in Equation 18.15 are replaced by the much shorter undepleted thicknesses $W_p$ and $W_n$, which boosts the diode current. Consider now a pn junction in which the n-side is doped much more heavily than the p-side, $N_D \gg N_A$, indicated by the sign $n^{++}$. The n-side is called the emitter, and the p-side is called the base, in anticipation of their roles in the bipolar transistor. The minority electron density injected from the emitter into the base is much larger than the minority hole density injected from the base into the emitter. The corresponding ratio of current densities is given in general by Equation 18.20, and for the specific case discussed here is simply $N_D/N_A$. For example, if the doping densities are $N_D = 10^{17}$/cm$^3$ and $N_A = 2\times10^{15}$/cm$^3$, $J_n/J_p = 50$. This means the electron current at the depletion region edge is 50 times larger than the hole current. The ratio is maintained
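The role of the $\tanh(\ldots)$ factors in Equation 18.15 is easy to see numerically: the geometry factor $D_n/[L_n \tanh(W_p/L_n)]$ interpolates between $D_n/L_n$ for a long diode and $D_n/W_p$ for a short one. A sketch with assumed, illustrative minority carrier parameters:

```python
import numpy as np

# Illustrative minority-electron parameters on the p-side (assumed values)
Dn = 25e-4                  # diffusion constant, m^2/s (25 cm^2/s)
tau_n = 1e-7                # minority carrier lifetime, s
Ln = np.sqrt(Dn * tau_n)    # diffusion length

def geometry_factor(Wp):
    """The factor Dn/(Ln*tanh(Wp/Ln)) in Equation 18.15, in m/s."""
    return Dn / (Ln * np.tanh(Wp / Ln))

print(f"Ln = {Ln*1e6:.1f} um")
for Wp in [10 * Ln, Ln, 0.1 * Ln, 0.01 * Ln]:
    g = geometry_factor(Wp)
    print(f"Wp = {Wp/Ln:5.2f} Ln -> factor = {g:.3e} "
          f"(Dn/Ln = {Dn/Ln:.3e}, Dn/Wp = {Dn/Wp:.3e})")
```

Shrinking the undepleted region from $10L_n$ to $0.01L_n$ raises the saturation-current prefactor by two orders of magnitude, which is why thin bases give large injected currents.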

Fig. 18.20 Making an electron emitter from a pn diode. In an $n^{++}$/p diode, the minority electron injection is much larger than the minority hole injection. The ratio $J_n/J_p$ is the root of electronic gain (or electronic amplification) in bipolar junction transistors.


Fig. 18.21 The bipolar junction transistor. The carrier flow inside, the lower energy band diagram, and the minority carrier density profile inside the transistor are indicated for a forward-biased emitter/base junction, and a reverse-biased base/collector junction. The direction of current flow in the external wires is also indicated in response to the forward base/emitter voltage $V_{BE}$ and a reverse base/collector voltage $V_{BC}$. The emitter and base thicknesses are much smaller than the respective minority carrier diffusion lengths in them.

over a large range of forward bias voltages. If the two components $J_n$ and $J_p$ were accessible individually, then injecting say two holes from the base to the emitter will force the injection of $2 \times 50 = 100$ electrons from the emitter to the base. This is a 50$\times$ amplification of the signal, resulting purely from the way the diode is constructed. However, in a two-terminal diode, the two components $J_n$ and $J_p$ cannot be accessed individually, since the net current flowing outside the diode is always the sum of the two. Splitting the current components is achieved if a reverse-biased pn-junction collector is merged with the $n^{++}$/p emitter diode, as shown in Fig. 18.21. It has now become a three-terminal device, an npn bipolar junction transistor. Let us discuss the operation of the device qualitatively before evaluating its quantitative characteristics. All metal contacts to the three semiconductor regions – the emitter, base, and collector – are ohmic. When the emitter/base junction is forward biased with voltage $V_{BE} > 0$, a large number of electrons are injected from the $n^{++}$ emitter into the p-type base, where they are minority carriers. The thickness of the base is intentionally designed to be far smaller than the electron diffusion length: its undepleted thickness is $W_B \ll L_{nB}$. Nearly all of the injected electrons therefore diffuse across the base and are collected by the reverse-biased base/collector junction, making the collector current nearly equal to the emitter electron current while the base current remains the small hole current: $\frac{I_C}{I_B} \approx \frac{J_n}{J_p} \gg 1 \implies$ current gain.


A current $I_B$ flowing into the base terminal of the BJT forces an amplified current $I_C$ to flow into the collector terminal. The current gain $I_C/I_B \gg 1$ is determined by the semiconductor properties and device geometry, and not by the applied voltages or currents over a wide range, as we will shortly discuss. Note that the current flowing in the collector terminal $I_C$ becomes independent of $V_{BC}$; it depends on $V_{BE}$ instead, again showing the transferred resistor effect, which is why it is called the transistor. Fig. 18.19 anticipated this property.


Bipolar transistor performance boosters: short bases and heterojunctions: In modern bipolar transistors, the thickness of the undepleted emitter $W_E$ and the undepleted base $W_B$ are much smaller than the respective minority carrier diffusion lengths in those layers (see Fig. 18.21). The collector depletion region $W_C$ is also kept short: these small dimensions make the device faster. In addition, we will now see that choosing the bandgap of the emitter $E_{gE}$ to be larger than that of the base $E_{gB}$ significantly boosts the transistor performance. With these modifications, the ratio of the electron current $J_{nE}$ injected by the emitter into the base, and the hole current $J_{pE}$ injected by the base into the emitter becomes

$$\frac{J_{nE}}{J_{pE}} = \frac{D_{nB}}{D_{pE}} \cdot \frac{W_E}{W_B} \cdot \frac{N_{DE}}{N_{AB}} \cdot \frac{n_{iB}^2}{n_{iE}^2} = \frac{D_{nB}}{D_{pE}} \cdot \frac{W_E \cdot N_{DE}}{W_B \cdot N_{AB}} \cdot e^{\frac{E_{gE} - E_{gB}}{k_bT}},$$



where the diffusion constants and doping densities of the respective emitter/base regions are indicated by the subscripts. Note that since the bandgap difference appears in the exponential, in homojunctions this factor is unity, but in heterojunctions it is a dominant control parameter. For example, if $E_{gE} - E_{gB} = 6k_bT \sim 0.156$ eV, the $J_{nE}/J_{pE}$ ratio is boosted by $e^6 \sim 400$ due to the exponential term. Bipolar transistors that use heterojunctions to boost their performance are sometimes called heterostructure bipolar transistors, or HBTs, to distinguish them from BJTs. The emitter injection efficiency $\gamma_E$ of the bipolar transistor is defined as the fraction of the emitter/base diode minority carrier current injected from the emitter into the base that is capable of reaching the collector ($J_{nE}$ in this case¹¹):

$$\gamma_E = \frac{J_{nE}}{J_E} = \frac{J_{nE}}{J_{nE} + J_{pE}} = \frac{1}{1 + \frac{J_{pE}}{J_{nE}}} = \frac{1}{1 + \frac{D_{pE}}{D_{nB}} \cdot \frac{W_B \cdot N_{AB}}{W_E \cdot N_{DE}} \cdot e^{-\frac{E_{gE} - E_{gB}}{k_bT}}}$$
(18.23)

A very small fraction of the minority electron current $J_{nE}$ injected into the base from the emitter may recombine in the base before making it to the collector. Let this base recombination current component be $J_{rB}$, implying the current that makes it to the collector is $J_C = J_{nE} - J_{rB}$. The base transport factor $\alpha_T$ of the bipolar transistor is defined as the fraction of the electron current from the emitter that makes it through the base to the collector:

$$\alpha_T = \frac{J_C}{J_{nE}} = \frac{J_{nE} - J_{rB}}{J_{nE}} = 1 - \frac{J_{rB}}{J_{nE}}.$$


With these definitions, $J_C/J_E = \gamma_E \cdot \alpha_T$. From Fig. 18.21 the currents of the three terminals follow $I_E = I_C + I_B$. Assuming a uniform cross-sectional area, this translates to $J_E = J_C + J_B$. Then, the current gain $\beta_F$ of the bipolar transistor is defined as

$$\beta_F = \frac{J_C}{J_B} = \frac{J_C}{J_E} \cdot \frac{J_E}{J_B} = \frac{\frac{J_C}{J_E}}{1 - \frac{J_C}{J_E}} = \frac{\gamma_E \cdot \alpha_T}{1 - \gamma_E \cdot \alpha_T}.$$


¹¹ $\gamma_E$ is therefore a measure of the "one-sidedness" or asymmetry of electron and hole minority currents of the emitter-base diode.


Let us now evaluate the base transport factor $\alpha_T = 1 - J_{rB}/J_{nE}$. Since $J_{nE} = q n_{p0} \frac{D_{nB}}{W_B}(e^{qV_{BE}/k_bT} - 1)$, where $n_{p0} = n_{iB}^2/N_{AB}$, we must find the recombination current in the base. To do so, we use Equation 18.12:

$$R - G = \frac{n \cdot p - n_i^2}{\tau_0 [n + p + 2n_i \cosh(\frac{E_t - E_i}{k_bT})]}.$$

From Fig. 18.21, we observe that inside the undepleted base region of thickness $W_B$, $n \cdot p \gg n_i^2$, so we neglect the $n_i^2$ term in the numerator of the recombination rate. In this region, $p \gg n \gg n_i$, so we neglect the $n$ and $n_i$ terms in the denominator. Since $\tau_0 = \tau_n$ is the minority carrier lifetime, we get

$$R(x) \approx \frac{n(x)}{\tau_n} = \frac{n_{p0}}{\tau_n} \cdot (e^{\frac{qV_{BE}}{k_bT}} - 1) \cdot (1 - \frac{x}{W_B}),$$

$$\implies J_{rB} = q \int_0^{W_B} dx\, R(x) = \frac{q n_{p0}}{\tau_n} \cdot (e^{\frac{qV_{BE}}{k_bT}} - 1) \int_0^{W_B} dx\, (1 - \frac{x}{W_B})$$

$$\implies J_{rB} = \frac{q Q_{nB}}{\tau_n} = \frac{q n_{p0} W_B}{2\tau_n} \cdot (e^{\frac{qV_{BE}}{k_bT}} - 1)$$

$$\implies \alpha_T = 1 - \frac{J_{rB}}{J_{nE}} = 1 - \frac{\frac{q n_{p0} W_B}{2\tau_n}(e^{\frac{qV_{BE}}{k_bT}} - 1)}{q n_{p0} \frac{D_{nB}}{W_B}(e^{\frac{qV_{BE}}{k_bT}} - 1)}$$

$$\implies \alpha_T = 1 - \frac{W_B^2}{2 D_{nB} \tau_n} = 1 - \frac{\tau_{tr}}{\tau_n} = 1 - \frac{W_B^2}{2 L_{nB}^2}. \qquad (18.27)$$

Fig. 18.22 A bipolar transistor biased in the common-emitter configuration and its symbol. The collector current $J_C$ and base current $J_B$ increase exponentially with $V_{BE}$. The ratio $\beta_F = J_C/J_B$ is the current gain. The transistor characteristics $I_C$ as a function of $V_{CE}$ of the common-emitter circuit show current saturation, and its control with the base current $I_B$.

¹² The recombination current that occurs due to SRH processes inside the emitter-base depletion region goes as $e^{qV/2k_bT}$. See Exercise 18.7 to explore this further.

We have defined the base transit time for minority carriers as $\tau_{tr} = W_B^2/2D_{nB}$, and identify $Q_{nB}$ as the total minority carrier sheet density stored in the base, $Q_{nB} = \int_0^{W_B} dx \cdot \Delta n_p(x)$. Note that the undesired base recombination current $J_{rB}$ has the same voltage dependence¹² as the desired emitter injection current $J_{nE}$. The ratio of the transit time to the lifetime appears explicitly. As an example, if $W_B = 100$ nm and $D_{nB} = 100$ cm$^2$/s, the transit time is $\tau_{tr} = W_B^2/2D_{nB} = 5 \times 10^{-13}$ s, or 0.5 ps. Compare this to typical minority carrier lifetimes $\tau_n$, which can range from ms – ns. Since $\tau_{tr} \ll \tau_n$, the base transport factor $\alpha_T$ is very close to unity, and the current gain¹³ becomes large, $\beta_F \approx \tau_n/\tau_{tr}$. The fact that electronic gain is the ratio of the lifetime to transit time is a powerful general concept for minority carrier devices¹⁴. Speed of bipolar transistors: The transistor gain we have evaluated is its value at the low-frequency, or DC limit. If the base current is varied rapidly in time at frequency $f$, the collector current is amplified only up to a certain $f$. The speed of the transistor is quantified by the unity gain cutoff frequency $f_T$ and the power gain cutoff frequency $f_{max}$. If instead of a purely DC current, an additional AC current is injected into the base, $I_B + i_b \cos(2\pi f t)$ of frequency $f$ Hz, the collector current should be $\beta_F(f)(I_B + i_b \cos(2\pi f t))$. However, $\beta_F(f)$ is frequency dependent, and drops with increasing $f$, reaching unity (= no gain) at the frequency $f_T$. The electronic gain decreases
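The chain from device parameters to gain can be sketched numerically. The values below are illustrative assumptions chosen to match the orders of magnitude quoted in the text (a 100 nm base with $D_{nB} = 100$ cm$^2$/s and a 1 ns lifetime), not a specific device:

```python
import numpy as np

kT = 0.02585           # thermal energy at 300 K, eV

# Illustrative homojunction BJT parameters (assumed values)
WB = 100e-9            # undepleted base thickness, m
DnB = 100e-4           # electron diffusion constant in base, m^2/s
tau_n = 1e-9           # minority electron lifetime in base, s (1 ns)

WE = 100e-9            # undepleted emitter thickness, m
DpE = 10e-4            # hole diffusion constant in emitter, m^2/s
NDE, NAB = 1e19, 1e17  # emitter/base dopings, /cm^3 (only the ratio matters)
dEg = 0.0              # homojunction: no emitter/base bandgap difference, eV

# Transit time and base transport factor (Eq. 18.27)
tau_tr = WB**2 / (2 * DnB)
alpha_T = 1 - tau_tr / tau_n

# Emitter injection efficiency (Eq. 18.23)
gamma_E = 1 / (1 + (DpE / DnB) * (WB / WE) * (NAB / NDE) * np.exp(-dEg / kT))

beta_F = gamma_E * alpha_T / (1 - gamma_E * alpha_T)

print(f"tau_tr = {tau_tr*1e12:.2f} ps, alpha_T = {alpha_T:.6f}")
print(f"gamma_E = {gamma_E:.6f}, beta_F = {beta_F:.0f}")

# A 6 kT emitter/base bandgap offset boosts JnE/JpE by exp(6) ~ 400
print(f"heterojunction boost for dEg = 6kT: {np.exp(6):.0f}x")
```

With these assumed numbers the gain lands in the hundreds; replacing `dEg = 0.0` by `6 * kT` shows how an HBT pushes $\gamma_E$ even closer to unity.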

¹³ Note that if $\alpha_T = 1$, the gain is $\beta_F = \gamma_E/(1 - \gamma_E)$. Conversely, if $\gamma_E = 1$, the gain becomes $\beta_F = \alpha_T/(1 - \alpha_T) = (2D_{nB}\tau_n/W_B^2) - 1$, as seen below.

¹⁴ The meaning of this relation is the following: if we inject one hole into the base of the transistor, it must linger around for $\tau_n$ seconds before recombining with a minority electron. However, since a minority electron is whizzing past the base into the collector, it spends only $\tau_{tr}$ seconds in the base, where $\tau_{tr} \ll \tau_n$.

The all-important relation between the 2D electron sheet density $n_s$


Fig. 19.3 Energy band diagram from the source to drain under the MOS capacitor. The source contact injects and fills the right-going $(k_x, k_y)$ states with $k_x > 0$, and the drain contact fills the left-going states with $k_x < 0$, and brings the respective distributions to equilibrium with themselves by sharing the quasi-Fermi levels $E_{Fs}$ and $E_{Fd}$ respectively. In this situation $E_{Fs} = E_{Fd} = 0$. The transcendental equation relates the 2D mobile carrier density $n_s$ with the gate voltage $V_{gs}$ of the MOS capacitor.

³ The allowed electron-wave states in the semiconductor channel can occupy the modes $(k_x, k_y) = (\pm n_x \frac{2\pi}{L}, \pm n_y \frac{2\pi}{W})$, just as the allowed modes of light in a microwave, or optical waveguide. Here $n_x, n_y = 0, \pm 1, \pm 2, \ldots$. Each state has a characteristic group velocity $v_g = \frac{1}{\hbar}\nabla_k E(k_x, k_y)$, and its occupation probability $f(k_x, k_y)$ is controlled by the contact(s) with which the state is in equilibrium.

⁴ The value of the channel length $L$ will not appear in ballistic transport, since there is no scattering in the transport from the source to the drain. However, $L$ is rather important in scattering-dominated, long-channel transistors, where it will appear with the electron mobility. Transverse quantization and quantum-wire behavior occurs for very small $W$, as discussed later in this chapter.

Zeroes and Ones: The Ballistic Transistor

⁵ Note that we are re-labeling $N_c^{2d} = n_q$ for this chapter, and calling it a "quantum" sheet carrier concentration to simplify the notation for 2D channel FETs. The "semi-circle" integrals of Equation 19.1 are simplified by converting to radial coordinates $(k_x, k_y) = k(\cos\theta, \sin\theta)$.
























and the gate voltage $V_{gs}$ is shown boxed in Fig. 19.3. For a 2D semiconductor channel with a parabolic bandstructure, $n_s$ is given by

$$n_s = \frac{g_s g_v}{W \cdot L}\sum_{\mathbf{k}} f(\mathbf{k}) = \frac{g_s g_v}{(2\pi)^2}\int_{-\infty}^{+\infty} dk_y \int_{0}^{+\infty} dk_x\, \frac{1}{1 + \exp\left[\frac{E_c + \frac{\hbar^2(k_x^2 + k_y^2)}{2m_c^{\star}} - E_{Fs}}{k_bT}\right]} + \frac{g_s g_v}{(2\pi)^2}\int_{-\infty}^{+\infty} dk_y \int_{-\infty}^{0} dk_x\, \frac{1}{1 + \exp\left[\frac{E_c + \frac{\hbar^2(k_x^2 + k_y^2)}{2m_c^{\star}} - E_{Fd}}{k_bT}\right]}$$

$$\implies n_s = \frac{1}{2}\cdot n_q \cdot [F_0(\eta_s) + F_0(\eta_d)], \qquad (19.1)$$

where $F_0(\eta) = \ln[1 + e^{\eta}]$ is the Fermi–Dirac integral of order zero, $n_q = N_c^{2d}$ is the band-edge DOS of the semiconductor channel⁵, and $\eta_s = \frac{E_{Fs} - E_c}{k_bT}$ and $\eta_d = \frac{E_{Fd} - E_c}{k_bT}$. For the case $V_{gs} = V_{gd}$, $E_{Fs} = E_{Fd}$; using Equation 19.1 with the energy band diagram in Fig. 17.11, we obtain the central relation between the voltage $V_{gs}$ on the gate metal, and the resulting mobile 2D carrier density $n_s$ induced by field effect in the semiconductor channel:

$$e^{\frac{n_s}{n_b}}\cdot(e^{\frac{n_s}{n_q}} - 1) = e^{\frac{V_{gs} - V_T}{V_{th}}}, \qquad (19.2)$$

where $V_{th} = k_bT/q$ is the thermal voltage, the gate insulator barrier capacitance per unit area is $C_b = \epsilon_b/t_b$, and $n_b = C_bV_{th}/q$ is an effective 2D carrier density. From Section 17.3, the dependences of $n_s$ on $V_{gs}$ in two limits of the gate voltage are:

$$n_s \approx \frac{C_bC_q}{C_b + C_q}\left(\frac{V_{gs} - V_T}{q}\right) \text{ for } V_{gs} > V_T, \quad \text{and} \quad n_s \approx n_q e^{\frac{V_{gs} - V_T}{V_{th}}} \text{ for } V_{gs} < V_T. \qquad (19.3)$$

Fig. 19.4 The FET channel 2D electron gas density $n_s$ as a function of the gate voltage $V_{gs} - V_T$, calculated at 300 K and 77 K as solutions of Equation 19.2 (solid lines), and Equations 19.3 (dashed lines). The band parameters used are an effective mass of $m_c^{\star} = 0.2m_e$, spin and valley degeneracies $g_s = 2$ and $g_v = 1$. The insulator layer has a dielectric constant $\epsilon_b = 20\epsilon_0$, and a thickness $t_b = 2$ nm.


For the sub-threshold condition $V_{gs} < V_T$, the electron density in the semiconductor channel $n_s$ is exponentially dependent on $V_{gs}$, and the quasi-Fermi level of the semiconductor $E_{Fs} = E_{Fd} < E_c$ is inside the bandgap of the semiconductor, populating it with very few electrons. On the other hand, if $V_{gs} > V_T$, the Fermi level rises to $E_{Fs} = E_{Fd} > E_c$ and the mobile carrier density is roughly linearly proportional to the excess gate voltage above threshold. Let us compare an example of the exact solution of Equation 19.2 with the limiting cases of the sub-threshold and above-threshold conditions of Equation 19.3. Fig. 19.4 shows, in the logarithmic scale (top), and in the linear scale (bottom), how the overdrive gate voltage $V_{gs} - V_T$ controls the mobile 2D electron gas density $n_s$ in the semiconductor conduction band for a thin insulating dielectric. The solid lines are the exact solutions of Equation 19.2, and the dashed lines are the sub-threshold and on-state approximations of Equation 19.3. The mobile carrier density $n_s$ is changed at 300 K from the off-state value of $5 \times 10^5$/cm$^2$ to $\sim 10^{13}$/cm$^2$, nearly eight orders of magnitude, when the gate voltage is swept over the small interval from $V_{gs} - V_T = -0.4$ V to $+0.4$ V. This will be seen later to correspond to a similar order of magnitude change in the channel conductance. The typical on-state channel carrier concentration for most 2D FETs is $\sim 10^{13}$/cm$^2$. It is clear that the approximations of Equation 19.3 separately reproduce the on-state of the channel when $V_{gs} \gg V_T$ and it is very conductive, as well as the sub-threshold, or off-state of the channel when $V_{gs} \ll V_T$. The switching of the channel between these two states as $V_{gs}$ is swept through $V_T$ can be viewed as a gate-controlled metal-insulator transition: this is a characteristic of every FET. By connecting with several earlier chapters, especially the foundational material of Chapter 5, the general relationship of the mobile carrier density with source and drain quasi-Fermi levels for FET channels of $d$ dimensions is given by $n_d = \frac{1}{2}\cdot N_c^d \cdot [F_{\frac{d-2}{2}}(\eta_s) + F_{\frac{d-2}{2}}(\eta_d)]$, where $F_j(\eta)$ is the Fermi–Dirac integral of order $j$, $\eta_s = \frac{E_{Fs} - E_c}{k_bT}$, and $\eta_d = \frac{E_{Fd} - E_c}{k_bT}$.
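Equation 19.2 is transcendental in $n_s$, but since its left-hand side is monotonic it is easily solved by bisection. The sketch below uses the band and insulator parameters of Fig. 19.4 and reproduces the roughly eight-orders-of-magnitude swing of $n_s$ over $V_{gs} - V_T = \pm 0.4$ V:

```python
import math

# Physical constants
hbar = 1.054571817e-34
kB = 1.380649e-23
q = 1.602176634e-19
m0 = 9.1093837015e-31
eps0 = 8.8541878128e-12

# Parameters of Fig. 19.4 (from the caption)
T = 300.0
mc = 0.2 * m0
gs, gv = 2, 1
eps_b, tb = 20 * eps0, 2e-9

Vth = kB * T / q
nq = gs * gv * mc * kB * T / (2 * math.pi * hbar**2)   # band-edge 2D DOS density, /m^2
Cb = eps_b / tb
nb = Cb * Vth / q

def ns_of_Vov(Vov, lo=1e6, hi=2e17):
    """Solve Eq. 19.2, exp(ns/nb)*(exp(ns/nq)-1) = exp(Vov/Vth), by bisection.

    Taking logs gives the monotonic form ns/nb + ln(exp(ns/nq)-1) = Vov/Vth.
    """
    f = lambda ns: ns / nb + math.log(math.expm1(ns / nq)) - Vov / Vth
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for Vov in (-0.4, 0.0, 0.4):
    ns = ns_of_Vov(Vov)
    print(f"Vgs - VT = {Vov:+.1f} V -> ns = {ns*1e-4:.3e} /cm^2")
```

The exact numbers depend on the constants used: the on-state density here comes out near $10^{13}$/cm$^2$, and the off-state near $2\times10^5$/cm$^2$, the same order as the $5\times10^5$/cm$^2$ read off Fig. 19.4.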


19.2 The ballistic FET

With this understanding of the gate capacitor, we proceed to discuss the operation of the ballistic FET when a drain voltage $V_{ds}$ is applied to the drain ohmic contact. Fig. 19.5 (a) illustrates a field-effect transistor with a 2D conducting channel. The device is shown with two gates sandwiching a 2D crystal semiconductor channel, with the corresponding energy band diagram in the vertical direction shown in Fig. 19.5 (b). The two gates may have separate bias voltages, but we consider only the top gate bias in the following discussion⁸. The energy band diagram and the k-space occupation of the 2D electron modes are shown in Fig. 19.5 (d) for $V_{ds} = 0$, which is identical to the last section. Parts (e) and (f) of the figure show how the k-space occupation function changes at the source-injection point in response to the drain voltage. The source and drain Fermi levels are related by the applied voltage: $E_{Fs} - E_{Fd} = qV_{ds}$. The carriers injected from the source and drain contacts are indicated with different colors for the purpose of illustration. Since the drain is now biased, the local mobile electron concentration $n_s(x)$ must change under the gate in the channel from the source to the drain. We will account for this spatial variation in later sections, when we introduce scattering in the channel. However, for the ballistic FET, the device characteristics can be obtained by investigating the affairs at the source injection point, which is the region near the source where the slope of the conduction band edge is zero, called

8 The double-gate case is in Exercise 19.4.



Fig. 19.5 A ballistic FET composed of a thin, 2D semiconductor channel is shown in (a), with the energy band diagram in the vertical z-direction shown in (b). The current that flows in the semiconductor channel Id is shown in (c) in response to the gate voltage Vgs at a fixed drain voltage Vds in the log scale. Note the similarity to Fig. 19.4. The energy band diagram in (d) shows the case for Vds = 0, and (e, f) are for drain voltages Vds > 0. The drain current Id in response to the drain voltage Vds is shown in (g), each curve corresponding to a different gate voltage Vgs . The drain current increases as a gate-controlled resistor for small Vds , and saturates at high Vds . The current-voltage characteristics hold for both ballistic and scattering dominated FETs. The k-space occupation functions depicted are only for the ballistic case, at the source injection point, which is the location where the conduction band reaches its maximum. This source injection point is also called the ”top of the barrier” in transistor device physics.

⁹ At a high drain voltage, the source loses electrons and the drain gains them. To maintain charge neutrality, $E_{Fs} - E_c$ increases to compensate for the lost electrons, and $E_{Fd} - E_c$ decreases to account for those gained: this is shown schematically in Fig. 19.5 (f). Note that here $E_c$ is the conduction-band edge at the source injection point. Because the source loses high-energy electrons, it undergoes evaporative cooling, whereas the drain is heated since the ballistic electrons lose their energy by relaxing to the Fermi level after entering the drain. Note that the drain current saturates at high drain voltages: this is because only the right-going semi-circle of occupied states survives, and the shape of occupied states does not change beyond a certain drain voltage. The saturated drain current then changes only with the gate voltage, which controls the area of the right-going semi-circle of occupied states, as shown in Figures 19.5 (g, c, and f).

the "top of the barrier". Imagine being perched at this point, and counting electrons whizzing to the right and to the left. Because the transport is ballistic and there is no scattering, electrons that move to the right with kx > 0 must have begun their journey from the source contact. This includes electrons with non-zero ky: they are moving at an angle θ = tan⁻¹(ky/kx) to the x-axis. All right-going electrons are in equilibrium with the source electrode, and hence their population is characterized by a quasi-Fermi level equal to that of the source: EFs. The electrons at this point moving to the left share the quasi-Fermi level of the drain. But because Vds > 0, it has become energetically much more difficult for the drain to inject carriers that make it to the source injection point, since the drain Fermi level is at an energy qVds below the source Fermi level. Electrons injected from the drain that have energy lower than the top of the barrier are reflected back to the drain. Thus, there are far fewer electrons moving to the left at the source injection point, as indicated by a smaller left-going occupied-state semi-circle compared to the larger right-going half circle⁹, as seen in Fig. 19.5 (e). The total area of the two semi-circles is proportional to the local 2D sheet electron density at the source injection point. At Vds = 0, the density ns(Vgs) depends only on the gate voltage, and is given by Equation 19.2 everywhere in the channel.

How does the area change with the drain voltage at the source injection point? For a well-designed FET, to an excellent approximation, the area does not change: it depends only on the gate voltage, and not on the drain voltage.¹⁰ As a consequence, the right-half circle must increase in area, and the left-half circle must shrink, in a manner that the total area remains constant, as shown in Figures 19.5 (e) and (f). The quasi-Fermi levels EFs and EFd must adjust such that the drain-control relation EFs − EFd = qVds is met simultaneously with the drain control Equation 19.1 and the gate control Equation 19.2. Defining qVds/(kbT) = vd, EFs − EFd = qVds is rewritten as ηs − ηd = vd. The gate control equation yields a unique 2D sheet density corresponding to the gate voltage, denoted by ns(Vgs). The drain control equation then reads

    ns(Vgs) = (nq/2)·( ln[1 + e^{ηs}] + ln[1 + e^{ηs − vd}] )
    ⟹ (1 + e^{ηs})·(1 + e^{ηs − vd}) = e^{2ns(Vgs)/nq}.

Rearranging reveals that it is a quadratic equation in e^{ηs} in disguise, with the solution

    ηs = ln[ ( √[ (1 + e^{vd})² + 4 e^{vd}·( e^{2ns(Vgs)/nq} − 1 ) ] − (1 + e^{vd}) ) / 2 ].    (19.4)

The ballistic FET 435

10 This approximation is the key to quantitatively obtaining the output characteristics of the ballistic FET in a simple way. The approximation can be removed to allow for a capacitive coupling between the drain and the source injection point, but we avoid it in the treatment here.
The boxed solution in Equation 19.4 is the quantitative dependence of the source quasi-Fermi level ηs = (EFs − Ec)/(kbT) on the gate voltage through the relation ns(Vgs) in Equation 19.2, and on the drain voltage through vd. The quantum current is now obtained by summing the currents carried by each occupied (kx, ky) state:¹¹

    Id = (q gs gv / L)·Σ_{(kx,ky)} [v_g(kx, ky)·x̂]·f(kx, ky)
       = (q gs gv / L)·(L W/(2π)²)·∫ dkx dky [v_g(kx, ky)·x̂]·f(kx, ky)

    Id/W = (q gs gv ħ/((2π)² m*c))·[ ∫₀^∞ k² dk ∫_{−π/2}^{+π/2} dθ·cosθ / ( 1 + exp[(Ec + ħ²k²/(2m*c) − EFs)/(kbT)] )
            + ∫₀^∞ k² dk ∫_{+π/2}^{+3π/2} dθ·cosθ / ( 1 + exp[(Ec + ħ²k²/(2m*c) − EFd)/(kbT)] ) ]

    ⟹ Id/W = (q²/h)·Nc^{1d}·(kbT/q)·[ F_{1/2}(ηs) − F_{1/2}(ηs − vd) ],    (19.5)

where the prefactor (q²/h)·Nc^{1d}·(kbT/q) ≡ J0^{2d}, and Nc^{1d} = gs gv·(2π m*c kbT/h²)^{1/2} is a 1D band-edge DOS.

Fig. 19.6 Ballistic FET flowchart.

11 For a parabolic bandstructure, the group velocity projected along the x-direction is v_{gx} = ħkx/m*c. Using radial coordinates, (kx, ky) = k(cosθ, sinθ).
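The flowchart of Fig. 19.6 can be turned into a short numerical sketch. The following Python snippet is not from the book: the device parameters (εb = 20ε0, tb = 2 nm, m*c = 0.2me, gs = 2, gv = 1) follow the chapter's running example, while the quadrature for F_{1/2} and the bisection bracket are assumptions of this sketch. It solves the gate-control Equation 19.2 for ns(Vgs), evaluates ηs from the boxed Equation 19.4, and sums the current with Equation 19.5.

```python
import math

# Physical constants (SI units)
q, h, kB = 1.602176634e-19, 6.62607015e-34, 1.380649e-23
me, eps0 = 9.1093837015e-31, 8.8541878128e-12
hbar = h / (2 * math.pi)

# Running example of the chapter: GaN-like channel under a thin high-k barrier
T, gs, gv = 300.0, 2, 1
mc = 0.2 * me                                        # band-edge effective mass
Vth = kB * T / q                                     # thermal voltage, ~25.9 mV
Cb = 20 * eps0 / 2e-9                                # barrier capacitance, F/m^2
Cq = q * q * gs * gv * mc / (2 * math.pi * hbar**2)  # 2D quantum capacitance
nb, nq = Cb * Vth / q, Cq * Vth / q                  # densities in m^-2
Nc1d = gs * gv * math.sqrt(2 * math.pi * mc * kB * T) / h
J0 = (q * q / h) * Nc1d * Vth                        # A/m; ~0.2 mA/um here

def F_half(eta, N=4000):
    """Fermi-Dirac integral F_{1/2}(eta) by simple quadrature."""
    xmax = max(eta, 0.0) + 40.0
    dx = xmax / N
    s = sum(math.sqrt(i * dx) / (1.0 + math.exp(i * dx - eta))
            for i in range(1, N))
    return s * dx / math.gamma(1.5)

def ns_gate(Vov):
    """Gate control (Eq. 19.2): solve e^(ns/nb)(e^(ns/nq)-1) = e^(Vov/Vth)."""
    f = lambda n: n / nb + math.log(math.expm1(n / nq)) - Vov / Vth
    lo, hi = 1e-30, 1e18
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if f(mid) > 0 else (mid, hi)
    return 0.5 * (lo + hi)

def eta_source(ns, Vds):
    """Drain control: the boxed solution of Eq. 19.4 for eta_s."""
    vd = Vds / Vth
    A = math.exp(2.0 * ns / nq)
    disc = (1 + math.exp(vd))**2 + 4 * math.exp(vd) * (A - 1)
    return math.log((math.sqrt(disc) - (1 + math.exp(vd))) / 2)

def Id_per_W(Vov, Vds):
    """Ballistic drain current per unit width (Eq. 19.5), in A/m."""
    es = eta_source(ns_gate(Vov), Vds)
    return J0 * (F_half(es) - F_half(es - Vds / Vth))
```

For Vgs − VT = 0.3 V and Vds = 0, each of the right- and left-going currents evaluates to roughly 1.6 mA/µm, consistent with the sidenote to Fig. 19.11.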

436 Zeroes and Ones: The Ballistic Transistor


































Fig. 19.7 The current-voltage characteristics of a ballistic FET at 300 K with εb = 20ε0 and tb = 2 nm. The semiconductor channel has a conduction band edge electron effective mass m*c = 0.2me, and a conduction band valley degeneracy gv = 1. The band properties are those of the wide-bandgap semiconductor GaN. The effective mass is similar to that of silicon, which has a larger gv, as discussed later.

The above relations, summarized in the flowchart in Fig. 19.6, provide the complete ballistic FET characteristics of interest. The ballistic FET current, written as Id/W = J0^{2d}·[F_{1/2}(ηs) − F_{1/2}(ηs − vd)] = JR − JL,

12 Note that in Equation 19.5, the gate length L from the source to drain does not appear. This is a characteristic of ballistic, quantum transport of electrons that do not suffer scattering in their motion in the channel. At T = 300 K, for m*c = 0.2me and gv = 1, the coefficient of the ballistic current density is J0^{2d} ∼ 0.2 mA/µm, a useful number to remember. Note also that the current density coefficient J0^{2d} is in the form of the product of the quantum of conductance, the thermal voltage, and the band-edge DOS of a dimension lower than that of the channel carriers. The generalization of the ballistic current for d dimensions is Jd = (q²/h)·Nc^{d−1}·(kbT/q)·[F_{(d−1)/2}(ηs) − F_{(d−1)/2}(ηs − vd)]. Expressions for all dimensions are listed in Fig. A.1 of the Appendix for easy reference.


where JR = J0^{2d}·F_{1/2}(ηs) and JL = J0^{2d}·F_{1/2}(ηs − vd), shows the net current as the difference of the right-going and the left-going currents¹².

19.3 Ballistic I-V characteristics

Fig. 19.7 shows the output characteristics of a ballistic FET calculated using the formalism developed in the last section. The procedure for the calculation is summarized in the flowchart shown in Fig. 19.6 as three primary equations: the gate control, the drain control, and the ballistic FET output current as a function of the gate and drain voltages. The only non-analytical step is the transcendental gate control equation that relates the 2D sheet density ns(Vgs) to the gate voltage, which is easily solved on a computer. By discussing a few features of the FET characteristics, we will recognize analytical approximations of the off-state and on-state characteristics, which will significantly aid our understanding without the need of a computer.

Fig. 19.7 (a) and (b) show the drain current per unit width as a function of the excess gate voltage Vgs − VT on log and linear scales respectively, for two values of drain voltage: Vds = 0.1 V and Vds = 0.4 V. The current for gate voltages below threshold (Vgs < VT) decreases exponentially with the gate voltage. It has a weak dependence on the drain voltage, and is called the off-state of the FET, characterized by an exponential dependence on the gate voltage. The current for gate voltages Vgs > VT is the on-state, which is characterized by a polynomial dependence on the gate voltage. Fig. 19.7 (c) shows the drain current per unit width as a function of the drain voltage for various Vgs. It is characterized by a roughly linear regime for Vds < Vgs − VT, and a saturated current for Vds > Vgs − VT, with a smooth transition across the Vds ≈ Vgs − VT contour, shown as a dashed line. This is a general characteristic of all FETs. Let us investigate the limiting regimes, for which we can obtain analytical forms.

Off-state: The off-state of the FET was discussed in Equation 19.3; the solution to the gate-control equation is ns ≈ nq·exp[(Vgs − VT)/Vth], i.e. e^{ηs} ≈ e^{(Vgs − VT)/Vth}. The ballistic current in this limit is¹³

    Id/W ≈ J0^{2d}·e^{(Vgs − VT)/Vth}·(1 − e^{−vd}).    (19.6)

Ballistic I-V characteristics 437

13 We have used the Maxwell–Boltzmann approximation F_{1/2}(η) ≈ e^η of the Fermi–Dirac integrals in the equation for the current, valid since η << −1 in the off-state. The exponential dependence of the off-state drain current on the gate voltage is thus seen explicitly; the weak drain-voltage dependence follows since (1 − e^{−vd}) → 1 for vd >> 1.

On-state: For Vgs − VT >> Vth, the FET is in the on-state. The ballistic current in this limit is¹⁴

    Id/W|on ≈ Cg·(Vgs − VT)·(ħ/m*c)·√[ 128·Cg·(Vgs − VT)/(9π gs gv q) ].    (19.7)

14 Note that in the on-state, q·ns ≈ Cg·(Vgs − VT), where Cg = Cb·Cq/(Cb + Cq). The quasi-Fermi level at the source injection point for the right-going carriers satisfies ηs >> +1, since the gate has populated the channel with a lot of electrons. Consider the case when the drain voltage is Vds >> Vgs − VT, implying ηs − vd << −1, so the left-going current is negligible. For ηs >> +1, the Fermi–Dirac integral in the expression for the ballistic current approximates to its degenerate, or polynomial, limit F_{1/2}(η) ≈ η^{3/2}/Γ(5/2), yielding the on-state ballistic current limit of Equation 19.7. Since Id is saturated, Vds does not appear explicitly.

Performance: The highest on-current maximizes the FET performance. The on-state ballistic current has a non-monotonic dependence on the semiconductor band-edge effective mass m*c and valley degeneracy gv, because of their appearance in both the numerator and the denominator of Equation 19.7. The maximum, inferred by taking a derivative of Equation 19.7, occurs when Cq ∼ Cb, i.e. when the semiconductor quantum capacitance is nearly matched to the barrier capacitance. As shown in Fig. 19.8, for a given Cb, the current is maximized for a semiconductor that meets specific m*c and gv. Fig. 19.8 indicates that for the chosen barrier, the ideal semiconductor channel with gv = 1 should have m*c ∼ 0.15me. For large effective masses, a lower valley degeneracy is desirable since Id/W|on ∼ 1/(m*c·√gv), whereas for low effective masses, a larger valley degeneracy is desired, since Id/W|on ∼ gv·√(m*c).

Effective mass, valley, and spin degeneracies: Fig. 19.9 shows the effect of the band parameters of the semiconductor channel on the



Fig. 19.8 The calculated current in a ballistic FET at 300 K for semiconductor channels with varying valley degeneracies and band edge effective masses, for Vgs − VT = 0.4 V and Vds = 0.4 V, with barrier thickness tb = 2 nm and dielectric constant εb = 20ε0. At current saturation, the maximum current is achieved when the barrier capacitance and the quantum capacitance are nearly matched: Cb ≈ Cq. Actual current drives are typically lower than the calculated ballistic values due to contact resistances.
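Equation 19.6 implies the textbook subthreshold behavior: one decade of current per ln(10)·Vth ≈ 60 mV of gate voltage at 300 K, and a drain-voltage dependence that saturates once vd >> 1. A minimal sketch, not from the text; the default J0 = 0.2 mA/µm is the coefficient quoted in sidenote 12, everything else is standard constants:

```python
import math

kB, q = 1.380649e-23, 1.602176634e-19

def off_current(Vov, Vds, J0=0.2e-3, T=300.0):
    """Off-state ballistic current per width, Eq. 19.6 (units of J0).
    Vov = Vgs - VT is negative in the subthreshold regime."""
    Vth = kB * T / q
    return J0 * math.exp(Vov / Vth) * (1.0 - math.exp(-Vds / Vth))

def subthreshold_swing(T=300.0):
    """Ideal subthreshold swing ln(10)*kB*T/q, in mV per decade."""
    return math.log(10.0) * (kB * T / q) * 1e3
```

At 300 K the swing evaluates to about 59.6 mV/decade, the ideal limit for any FET whose channel charge is thermally activated.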































Fig. 19.9 Effect of the semiconductor channel band parameters on ballistic FET Id − Vds characteristics for a gate barrier of dielectric constant εb = 20ε0 and thickness tb = 2 nm at 300 K. More valleys reduce the drain voltage required to obtain current saturation.

15 A limiting case is obtained for 2D graphene, which has a zero bandgap. Though the bandstructure of 2D graphene is linear, and not parabolic, the ballistic characteristics may be evaluated in a similar fashion as for parabolic bands. Since the gap is zero, the on/off ratio of a FET made with a graphene channel is very small.

Fig. 19.10 Spin-valley splitting in the valence bands of transition-metal dichalcogenide 2D semiconductors such as WSe2 . If the Fermi level is between the 2 valence bands, there are spin-polarized hole pockets in the k-space, in which each state has a spin-degeneracy of gs = 1, and not 2. Spin-splitting occurs in conventional semiconductors in a magnetic field, and in the absence of external magnetic fields if the semiconductor is intrinsically magnetic, such as in dilute magnetic semiconductors.

characteristics of the ballistic FET for a thin, high-κ dielectric gate insulator. For semiconductors with small effective masses, a large valley degeneracy is seen to reduce the Vds required to achieve current saturation. In discussions of semiconductor bandstructure, especially in Chapter 12 on the k·p theory, we have seen that small effective masses at the band edges are related to small energy bandgaps. For sp³-bonded direct-bandgap semiconductors, we found a crude relationship of m*c/me ≈ Eg/20, where Eg is in eV. Though small effective mass semiconductors can provide high on-currents, it is typically difficult to turn them off effectively, since the Fermi level can move easily through the small bandgap to the undesired band, for example creating holes in an n-channel device. This leads to effects such as interband tunneling and impact ionization at high drain voltages, which make it difficult to obtain very large on-off ratios for FETs formed of very small bandgap and small effective mass semiconductors¹⁵.

For most semiconductors, the spin degeneracy is gs = 2 for each energy eigenvalue E(k). This implies the total number of electrons that can be fit into any k-state is gs gv = 2gv. There are some notable exceptions. For example, in some 2D semiconductor crystals such as monolayer WSe2, there are gv = 2 inequivalent conduction and valence band valleys in the first Brillouin zone, called the K and K′ valleys. Because of the crystal symmetries, the spin states are split differently in the different valleys. This is shown in Fig. 19.10. The valence bands in the K and K′ valleys are split into two spin-polarized bands. If a p-channel FET is made with such a semiconductor, then the correct factor of gs gv must be used to obtain the ballistic current density. For example, if the Fermi level lies between the two spin-split valence bands, the spin-valley degeneracy factor is gs gv = 1·gv = 2, and not 4.

Injection velocity: The ballistic FET output characteristics shown in Fig. 19.7 (c) are re-plotted on the left of Fig. 19.11 to indicate three distinct phases of operation. For a drain voltage of Vds = 0.15 V,















the gate sweeps the device from the subthreshold region, through a saturation regime, into a linear regime along the dashed line. The corresponding phase diagram is shown in the middle for the nFET, and also for a corresponding pFET, as a function of the drain and gate voltages¹⁶. When the drain voltage is increased, more right-going states are occupied, while the total occupied area does not change, as shown in the k-space. This increases JR and reduces JL, as seen in the output characteristics. JL is a reasonable fraction of the net current in the linear regime of operation. When Vds exceeds roughly Vgs − VT, JL becomes negligible, most of the current is carried by right-going states as JR, and the current saturates. To connect with the classical notion of current being a product of charge and velocity, one can define an ensemble injection velocity vinj as simply the net current divided by the net sheet charge qninj at the source injection point via the relation¹⁷

    vinj = (Id/W)/(q ninj) = [ (q²/h)·Nc^{1d}·(kbT/q)·( F_{1/2}(ηs) − F_{1/2}(ηs − vd) ) ] / [ q·(nq/2)·( F₀(ηs) + F₀(ηs − vd) ) ]

    ⟹ vinj^{sub} ≈ √( 2kbT/(π m*c) )·tanh( qVds/(2kbT) )    for Vgs < VT,

       vinj^{sat} ≈ (ħ/m*c)·√( 128·Cg·(Vgs − VT)/(9π gs gv q) )    for Vgs > VT.    (19.8)
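The two limits of Equation 19.8 are easy to evaluate numerically. A sketch (not from the text; the effective mass m*c = 0.2me matches the chapter's example, and the Cg value passed in is whatever series combination Cb·Cq/(Cb + Cq) applies to the device at hand):

```python
import math

kB, q = 1.380649e-23, 1.602176634e-19
me, hbar = 9.1093837015e-31, 1.054571817e-34

def vinj_sub(mc_rel, Vds, T=300.0):
    """Subthreshold injection velocity of Eq. 19.8, in m/s (purely thermal)."""
    mc = mc_rel * me
    return math.sqrt(2 * kB * T / (math.pi * mc)) * math.tanh(q * Vds / (2 * kB * T))

def vinj_sat(mc_rel, Cg, Vov, gs=2, gv=1):
    """On-state injection velocity of Eq. 19.8; Cg in F/m^2, Vov = Vgs - VT."""
    mc = mc_rel * me
    return (hbar / mc) * math.sqrt(128 * Cg * Vov / (9 * math.pi * gs * gv * q))
```

For m*c = 0.2me at 300 K the subthreshold value is about 1.2 × 10⁷ cm/s, in the 10⁷ cm/s range quoted for Fig. 19.11.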


Fig. 19.11 shows vinj on the right for the ballistic FET of Fig. 19.7 at two temperatures¹⁸, seen to be in the range of 10⁷ cm/s.

Surface potential and gate control: The electrostatic control of the mobile carrier density at the source-injection point by the voltage Vgs applied on the gate is critical to the switching efficiency, gain, and speed of the transistor. The energy band diagram at the top of Fig.


Fig. 19.11 Left: Output characteristics of a ballistic FET from Fig. 19.7 (c) showing the phases of operation, the right-going ballistic current JR, and the left-going component JL at Vgs − VT = 0.3 V; the net current is Jtot = JR − JL. The k-space occupation functions at this gate voltage and Vds = 0.15 V are shown in the center, bottom plot. Center, top: Phase diagram of operation of the ballistic FET as a function of the gate and drain voltages for nFETs and pFETs, showing the subthreshold, saturation, and linear regimes. Right: The injection velocity vinj at the source injection point of the ballistic FET as a function of Vgs − VT at two temperatures, calculated using Equation 19.8, demarcating the three phases of operation.

16 The k-space occupation functions at the source injection point show that for a fixed gate voltage, at Vds = 0 V equal numbers of right-going and left-going states are occupied, leading to equal currents JR = JL = J0, and net zero current. This current for Vgs − VT = 0.3 V is seen in the left figure to be J0 ≈ 1.6 mA/µm.

17 The approximation for the subthreshold regime is obtained using ηs << −1, when F_j(η) ≈ e^η, and for the saturation regime using ηs >> +1, when F_{1/2}(η) ≈ η^{3/2}/Γ(5/2): notice how often these approximations appear!

18 The subthreshold vinj is entirely thermal, independent of the gate voltage, and also independent of the drain voltage for Vds >> Vth. On the other hand, vinj in the saturation and linear regimes is only weakly temperature-dependent, and controlled mostly by the gate voltage: its dependence on Vgs can be concluded from the k-space occupation of semi-circles. The injection velocity is useful because if the gate-controlled ninj is known, the current can be obtained in a simple fashion. However, since it is a weighted mean, one must exercise caution in attaching too much physical meaning to it. It is clear from the k-space image that there is a large spread in electron velocity vectors: some of them are actually going from the drain to the source, and some at right angles, along the width of the FET! Exercise 19.6 discusses injection velocity further.


19.12 shows how the voltage applied on the gate divides into two parts: a voltage drop Vb in the (oxide) barrier, and a voltage drop Vs in the semiconductor that goes into creating the mobile carriers in the channel and is typically referred to as the surface potential. It is desirable to have a significant fraction go into creating the mobile carriers that actually change the conductivity of the channel, whereas Vb is the portion that sustains the electric field in the barrier, which prevents carriers from flowing from the channel into the gate. Let us investigate how the choices of the barrier material via εb and tb, and of the semiconductor channel material via m*c and gv, affect exactly how this voltage division occurs. Let qφb be the M-O barrier height, and ΔEc the O-S conduction band offset. The field in the barrier is Fb = Vb/tb = q ns/εb. By following the energy in the loop from the metal Fermi level EFm to the semiconductor Fermi level EFs, in a manner similar to Chapter 17, Section 17.3, we obtain for a sheet density ns in the semiconductor channel

    qφb + q·(q ns/εb)·tb − ΔEc + (EFs − Ec) = qVgs,
    qVT ≡ qφb − ΔEc  ⟹  (Vgs − VT) = q ns/Cb + (EFs − Ec)/q,    (19.9)

with Vb = q ns/Cb and Vs = (EFs − Ec)/q,






Fig. 19.12 The dependence of the surface potential of the semiconductor channel in a FET on various transistor design parameters at 300 K.


which shows the voltage division explicitly. By using the exact relation of ns with Vgs from Equation 19.2, the voltage division is quantitatively obtained, as shown in Fig. 19.12 (a-e) for several choices of the barrier and semiconductor channel. Panel (a) shows the division of voltage against the dashed line, which is the Vs = Vgs − VT limit. We note that the entire gate voltage transfers to the semiconductor channel under subthreshold conditions; for Vgs − VT = 0.4 V, ηs >> +1 and the carrier density is degenerate. The voltage dropped in the barrier at this gate voltage is Vb = 0.4 − 0.15 = 0.25 V. As tb is increased from 2 nm to 20 and 22 nm, Vs drops and Vb increases, since a thick barrier implies a larger voltage drop in the oxide, and consequently weaker control of the channel conductivity by the gate. Note that for a thick barrier, the carrier density is only weakly degenerate, since Vs > 0 only for the highest Vgs. A thinner barrier is desirable for the best gate control, as long as the barrier can prevent


carrier flow between the gate metal and the semiconductor channel. The dependence on the barrier material dielectric constant εb is shown in panel (b): a higher value is desirable. Panel (c) shows the dependence on the band-edge effective mass of the semiconductor channel for a thin high-κ dielectric: a smaller effective mass implies a smaller density of states, which requires a larger Vs for the same carrier density. But for the thicker barrier of panel (d), the dependence on the effective mass is weaker, since most of the voltage drops in the barrier. Panel (e) shows the effect of the valley degeneracy of the semiconductor channel for a thin, high-κ dielectric: a smaller valley degeneracy leads to a higher surface potential because of a smaller DOS¹⁹.
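The voltage division of Equation 19.9 can be sanity-checked numerically. A sketch (the sheet density of 10¹³ cm⁻² and the barrier εb = 20ε0, tb = 2 nm are representative values taken from the chapter's examples, not a specific panel of Fig. 19.12):

```python
eps0, q = 8.8541878128e-12, 1.602176634e-19

def barrier_drop(ns_cm2, eb_rel=20.0, tb=2e-9):
    """Vb = q*ns/Cb: the part of (Vgs - VT) dropped across the gate barrier
    (first term of Eq. 19.9). ns_cm2 is the sheet density in cm^-2."""
    Cb = eb_rel * eps0 / tb           # barrier capacitance per area, F/m^2
    return q * (ns_cm2 * 1e4) / Cb    # convert cm^-2 -> m^-2
```

With ns = 10¹³ cm⁻², the thin high-κ barrier drops about 0.18 V; a 20 nm barrier of the same material would drop ten times more, which is why thick barriers surrender gate control of the channel.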

19.4 Quantum wire ballistic FET

In the last section, the ballistic FET characteristics were discussed for a semiconductor with a two-dimensional channel. The need for stronger gate control over the semiconductor channel, and for miniaturization and denser packing, makes FETs in which the semiconductor channel is one-dimensional attractive. Fig. 19.13 shows a 1D quantum-wire (or nanowire) FET. The quantum wire semiconductor channel is shown as a long cylinder²⁰ of radius rw, around which an insulating (oxide) barrier of thickness tb is wrapped, followed by a gate metal that is wrapped around the oxide. The source and drain ohmic contacts are made to the ends of the wire. We have already identified in Chapter 5 the ballistic current for a d-dimensional channel to be

    Jd = (q²/h)·Nc^{d−1}·Vth·[ F_{(d−1)/2}(ηs) − F_{(d−1)/2}(ηs − vd) ],

and the gate/drain control equation for the charge at the source injection point as

    nd = (Nc^{d}/2)·[ F_{(d−2)/2}(ηs) + F_{(d−2)/2}(ηs − vd) ].

If we can relate the applied voltages Vgs and Vds to ηs, we can obtain the characteristics of the ballistic 1D FET using d = 1. The 1D barrier capacitance for the cylindrical FET structure is Cb = 2πεb/ln(1 + tb/rw), in F/cm units. To obtain this, apply Gauss's law ∮ D·dS = Qencl inside the barrier layer at the source injection point²¹. For a density of mobile 1D carriers in the channel n1d, for a length L Gauss's law requires εb·E(ρ)·2πρ·L = q·n1d·L, which yields the electric field profile E(ρ) = q n1d/(2πεb ρ), a well-known result in electrostatics for a cylindrical capacitor. Note that unlike the 2D FET, in which the electric field was constant in the oxide, for the quantum wire the radial electric field increases from the metal to the semiconductor: inside the oxide it is higher at the semiconductor-oxide interface than at the oxide-metal interface. The line integral of the electric field in the oxide layer in the radial direction gives the voltage drop

    Vb = ∫_{ρ=rw}^{rw+tb} E(ρ)·dρ = ( q n1d/(2πεb) )·ln( (rw + tb)/rw ) = q n1d/Cb,

where Cb = 2πεb/ln(1 + tb/rw) is the barrier capacitance per unit length. The voltage division between the barrier and the semiconductor thus leads

Quantum wire ballistic FET 441

19 The discussion here has been limited to parabolic bandstructure, but may be re-done for crystals with linear (conical) bandstructure, or other cases. The same theoretical formalism that applies to gapped semiconductors applies to zero-gap graphene, metals, normal and Weyl semimetals, and topological insulators: all that is necessary is that the correct bandstructure E(k) be used with the corresponding distribution functions. For gapped semiconductors that have a parabolic band-edge dispersion, if the Fermi level samples regions of the bandstructure that have significant non-parabolicity, minor changes to the model are necessary: instead of m*c, the actual bandstructure E(k) must be used to re-derive the expressions for electrostatics and ballistic currents.

20 Quantum wire FETs with non-circular cross-sections and multiple subbands are also possible, and may be treated numerically.

21 The displacement vector is D = εb·E(ρ)·ρ̂, where ρ is the radial distance from the center of the quantum wire, ρ̂ is the radial unit vector, E(ρ) is the radial electric field, and Qencl is the charge enclosed in the Gaussian volume.


to the complete 1D ballistic FET characteristics²²:

    (Vgs − VT)/Vth = q·n1d/(Cb·Vth) + (EFs − Ec)/(q·Vth),  i.e.  vg = n1d/nb^{1d} + ηs

    n1d = (Nc^{1d}/2)·[ F_{−1/2}(ηs) + F_{−1/2}(ηs − vd) ]

    ⟹ n1d = (Nc^{1d}/2)·[ F_{−1/2}(vg − n1d/nb^{1d}) + F_{−1/2}(vg − n1d/nb^{1d} − vd) ]

    I1d = (q²/h)·gs gv·(kbT/q)·ln[ (1 + e^{ηs})/(1 + e^{ηs − vd}) ]

    ⟹ I1d ≈ [ (q²/h)·gs gv ]·Vds,    (19.10)

where the last approximation follows since ln[(1 + e^{ηs})/(1 + e^{ηs − vd})] ≈ ηs − (ηs − vd) = vd for ηs >> +1 and ηs − vd >> +1. Approximations similar to the 2D ballistic FET in the on- and off-states may be obtained from Equation 19.10 for the 1D ballistic FET:

    I1d^{lin} ≈ [ (q²/h)·gs gv ]·Vds    for Vds < Vgs − VT
    I1d^{sat} ≈ gs gv·h·[ Cg·(Vgs − VT) ]² / (32·q·m*c)    for Vds > Vgs − VT.

Fig. 19.13 A ballistic quantum wire FET in which the transport is one-dimensional. The calculation is for a quantum wire radius rw = 1 nm and an insulating barrier thickness tb = 2 nm with a dielectric constant εb = 20ε0. Note that even at 300 K, Id/Vds in the linear regime is a quantized conductance, GQ = 2q²/h ∼ [12950 Ω]⁻¹.

22 The first boxed relation in Equation 19.10 must be self-consistently solved to obtain the 1D carrier density n1d as a function of the gate voltage via vg = (Vgs − VT)/Vth, the drain voltage via vd = Vds/Vth, the barrier parameters via nb^{1d} = Cb·Vth/q, and the semiconductor band parameters via the 1D band-edge effective DOS Nc^{1d}. Then, substituting ηs = vg − n1d/nb^{1d} from the first equation into the last boxed equation yields the complete output characteristics of the ballistic quantum wire FET.
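The cylindrical barrier capacitance and the 1D ballistic current of Equation 19.10 can be sketched as follows. The wire radius rw = 1 nm, tb = 2 nm, and εb = 20ε0 follow Fig. 19.13; the strongly degenerate ηs used in the test below is an arbitrary illustrative choice:

```python
import math

q, h, kB, eps0 = 1.602176634e-19, 6.62607015e-34, 1.380649e-23, 8.8541878128e-12

def Cb_wire(rw, tb, eb_rel=20.0):
    """Per-unit-length capacitance of the wrap-around barrier, F/m:
    Cb = 2*pi*eb / ln(1 + tb/rw)."""
    return 2 * math.pi * eb_rel * eps0 / math.log(1.0 + tb / rw)

def I_1d(eta_s, Vds, gs=2, gv=1, T=300.0):
    """Single-subband ballistic quantum-wire current, Eq. 19.10."""
    Vth = kB * T / q
    vd = Vds / Vth
    return (q * q / h) * gs * gv * Vth * math.log(
        (1 + math.exp(eta_s)) / (1 + math.exp(eta_s - vd)))
```

In the degenerate linear regime the conductance Id/Vds comes out as the quantum of conductance 2q²/h ≈ (12.9 kΩ)⁻¹, as noted in the caption of Fig. 19.13.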


Unlike the DOS of 3D and 2D parabolic bands, the 1D DOS decreases with increasing energy as 1/√(E − Ec), and has a van Hove singularity at the band edge. As the gate voltage is increased, the barrier capacitance Cb does not change, but the semiconductor quasi-Fermi


The drift-diffusion FET 443

level EFs is raised above Ec. Thus the semiconductor quantum capacitance Cq ≈ q²·DOS decreases, and beyond a certain gate voltage dominates the total gate capacitance, given by 1/Cg = 1/Cb + 1/Cq. This quantum capacitance-dominated regime of on-state operation increases the gate coupling to the quantum wire channel, since then Cg ≈ Cq. In 2D, the ballistic saturation current went as Id/W ∼ (Cg Vgs)^{3/2}, but for 1D it goes as Id ∼ (Cg Vgs)². In the regime in which quantum capacitance dominates, Cg ≈ Cq ∼ 1/√Vgs, which implies Id ∼ (√Vgs)² ∝ Vgs. This regime is of special interest in high-frequency amplifiers, because the transconductance, or gain, gm = ∂Id/∂Vg becomes independent of the gate voltage, enabling a highly linear operation regime without the generation of unwanted harmonics. The gain of FETs is discussed further in Section 19.6. Unlike in the 2D case, the velocity of mobile states in a single-mode 1D channel FET is directed solely along the source-drain axis. This directionality is responsible for more efficient transport and a higher efficiency of current flow per channel charge, since the carriers are not moving at angles to the source-drain axis, as a significant fraction of carriers do in the 2D case. The better transport and gate control are offset by higher source and drain contact resistances to the single, or very few, 1D modes of the channel.

19.5 The drift-diffusion FET

In the preceding sections of this chapter, we discussed the ballistic transistor as a limiting case of the FET when there is no scattering in the semiconductor channel. In this section we discuss the other limiting case, when the channel is much longer than the carrier mean free path between successive scattering events. Frequent scattering brings carriers into local equilibrium amongst themselves, rather than being in equilibrium with the source and drain contacts as was the case in the ballistic limit. Fig. 19.14 shows a FET of channel length L, the corresponding calculated energy band diagrams, the mobile surface sheet charge ns(x) in the semiconductor, and the x-directed electric field in the channel. Because the carrier transport is scattering-limited, the carriers undergo drift and diffusion as introduced in Chapter 16, Section 16.7, characterized by a mobility µ. We call it the "drift-diffusion" FET.

Let us consider an n-channel drift-diffusion FET. The local equilibrium of the mobile electrons at a point x from the source is characterized by a quasi-Fermi level Fn(x) = −qV(x), where V(x) is the local potential. Writing the electrostatic equation to relate the carrier density ns(x) at the point x in the channel to the gate voltage Vgs and the local electric potential V(x), we obtain the gate-control equation²³

    e^{ns(x)/nb}·( e^{ns(x)/nq} − 1 ) = e^{(Vgs − VT − V(x))/Vth}.    (19.12)



Because of the bias conditions, V ( x = 0) = 0 V at the source, and V ( x = L) = Vds at the drain. The current density at any point x is

23 Note that this equation is similar to the version used for the ballistic 2D FET earlier in Equation 19.2, with identical definitions of nb = Cb·Vth/q and nq = Cq·Vth/q, and VT = φb − ΔEc/q related to the M-O barrier height and the O-S conduction band offset. Since the scattering lets us define a local potential V(x) at the semiconductor surface, it appears in the exponent and allows the determination of the local sheet density ns(x) for a given Vgs. In the drift-diffusion FET, as well as the ballistic FET, V(x) = 0 at the source injection point when the source is grounded.


found from the local electric field F(x) = −dV(x)/dx:

    Id(x)/W = q µ ns(x)·F(x) = q µ ns(x)·[ −dV(x)/dx ].    (19.13)

To relate the local field to the local carrier density, we differentiate Equation 19.12 to get

    dV(x) = −Vth·[ 1/nb + (1/nq)·e^{ns(x)/nq}/( e^{ns(x)/nq} − 1 ) ]·dns(x).    (19.14)

Fig. 19.14 Long channel FET energy band diagrams, mobile charge, and x-directed electric field distribution. The semiconductor bandgap is Eg = 0.6 eV, the conduction band edge effective mass is m*c = 0.2me with a valley degeneracy gv = 1, and the electron mobility is µ = 200 cm²/V·s. The source and drain are heavily doped n++, and the oxide barrier is of thickness tb = 5 nm and dielectric constant εb = 20ε0. Note that when Vds = 0.5 V > Vgs − VT = 0.4 V, the carrier density at the drain end is "pinched off", and the x-directed electric field Fx(x) rises sharply, all the while maintaining Id/W = q ns(x) µ Fx(x) constant from source to drain. The electron quasi-Fermi level in the drain end enters the bandgap in this region of high field, shown in gray circles.

Since the current is continuous at all x, Id(x) = Id, we use dV(x) from Equation 19.14 in Equation 19.13, integrate along the channel from x = 0 to x = L, and rearrange to obtain

    Id = q µ·(W/L)·Vth·nq²·∫_{ud}^{us} [ ( (1/nb + 1/nq)·e^{u} − 1/nb ) / ( e^{u} − 1 ) ]·u·du
       = q µ·(W/L)·Vth·nq²·∫_{ud}^{us} [ (1/nb)·u + (1/nq)·( u·e^{u}/( e^{u} − 1 ) ) ]·du,    (19.15)

where u·e^{u}/(e^{u} − 1) ≈ u + e^{−u}. Note that we have defined u(x) = ns(x)/nq, which takes the values us = ns(x = 0)/nq at the source, and ud = ns(x = L)/nq at the drain, obtained by finding ns(x = 0) and ns(x = L) directly from Equation 19.12, using the boundary conditions V(0) = 0 and V(L) = Vds. The approximation to the last term is an excellent one for all u. Thus,

    Id = q µ·(W/L)·Vth·nq²·∫_{ud}^{us} [ (1/nb + 1/nq)·u + (1/nq)·e^{−u} ]·du

    Id = q µ·(W/L)·Vth·nq·[ (1 + nq/nb)·(us² − ud²)/2 + ( e^{−ud} − e^{−us} ) ]    (19.16)

is the desired drain current of the drift-diffusion FET as a function of the applied drain and gate biases and the device geometry. Fig. 19.15 shows the current transfer characteristics, and the output characteristics, for several cases using Equation 19.16. Let us discuss the FET charge, field, and energy band diagrams in Fig. 19.14 and relate them to the transistor output characteristics in Fig. 19.15. The first energy band diagram, shown for Vgs < VT, indicates EF is inside the bandgap, with few carriers in the conduction band. The carrier density is significantly enhanced when Vgs − VT = +0.4 V, when


CMOS and HEMTs 445



















































Fig. 19.15 Long channel FET characteristics. The gate dielectric has εb = 20ε0 and is tb = 5 nm thick. The semiconductor bandstructure information enters the gate control via an effective mass of m*_c = 0.2 m_e and a valley degeneracy of gv = 1, and a lumped transport parameter, the mobility µ. The switching characteristics show the typical logarithmic dependence of the drain current on the gate voltage. For an electron mobility of µ = 200 cm²/V·s, the current drops from Id/W ∼ 0.7 mA/µm to ∼ 0.07 mA/µm as the gate length is increased from L = 0.1 µm to 1.0 µm. This drop in the current drive may be recovered by increasing the mobility from 200 cm²/V·s to 2000 cm²/V·s. Note that velocity saturation, quite likely for a gate length of L = 0.1 µm, is neglected in this example for simplicity.

the Fermi level enters the conduction band. When a drain voltage Vds is applied, the quasi-Fermi level Fn(x) = −qV(x) must change such that it drops by qVds from the source to the drain. Note that the potential difference across the oxide layer at x varies from the source to the drain: it is high at the source end, but for positive drain voltages, small on the drain side. Since the mobile carrier concentration ns(x) is proportional to the potential difference Vgs − VT − V(x) across the oxide capacitor, the carrier concentration must decrease from the source to the drain24. Because the current is constant in x, the x-directed electric field F(x) = Id/[qµns(x)] must increase from the source to the drain. Current saturation occurs for Vds > Vgs − VT when the drain side of the channel is pinched off: this behavior is seen in Fig. 19.15 in the second panel, indicated by the dashed line25. Just as for the ballistic FET, the analytical approximations of the off-state subthreshold and on-state saturation current regimes of Equation 19.16 are obtained by using the two limits of Equation 19.3 of the carrier density ns(x):

$$ I_d^{off} \approx q\mu\,\frac{W}{L}\,n_q V_{th}\, e^{\frac{V_{gs}-V_T}{V_{th}}} \quad \text{for } V_{gs} < V_T $$

$$ I_d^{on} \approx \mu\,\frac{W}{2L}\,C_g\left(V_{gs}-V_T\right)^2 \quad \text{for } V_{gs} > V_T \text{ and } V_{ds} > V_{gs}-V_T \qquad (19.17) $$

The subthreshold I_d^off goes exponentially with Vgs, and the saturated I_d^on goes as its square.
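These limits (and Equation 19.16 itself) are easy to check numerically. A minimal sketch using the device parameters of Figs. 19.14–19.15 follows; the nb/nq ratio passed to the function is an assumed placeholder, since nb is set by band parameters not repeated here:

```python
import numpy as np

q, eps0, Vth = 1.602e-19, 8.854e-12, 0.0259  # charge, permittivity, thermal voltage

# Device of Figs. 19.14-19.15
Cg = 20 * eps0 / 5e-9        # oxide capacitance per area, F/m^2
mu = 200e-4                  # 200 cm^2/V.s in SI units
L = 0.1e-6                   # gate length, m
nq = Cg * Vth / q            # gate-defined density scale, 1/m^2

def Id_per_W(us, ud, nb_over_nq=50.0):
    """Drift-diffusion drain current Id/W (A/m) from Eq. 19.16.
    us, ud are ns/nq at the source and drain; nb/nq is an assumed ratio."""
    r = 1.0 / nb_over_nq     # = nq/nb
    return q * mu * Vth * nq / L * (
        (1 + r) * (us**2 - ud**2) / 2 + (np.exp(-us) - np.exp(-ud)))

# Eq. 19.17 limits at Vgs - VT = 0.4 V
Vov = 0.4
Id_on = mu * Cg * Vov**2 / (2 * L)   # square-law on-current, in A/m
SS = np.log(10) * Vth                # subthreshold swing, ~60 mV/decade
```

In the on-state limit (us = Vov/Vth large, ud → 0, nb ≫ nq), the Eq. 19.16 expression reproduces the square-law Eq. 19.17 on-current to within a few percent.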

19.6 CMOS and HEMTs CMOS: The silicon MOSFET is by far the most widely used transistor. The heart of the device is the MOS capacitor, whose operation

24 The calculated ns(x) is shown in Fig. 19.14 for various drain voltages: for the case chosen, the density decreases from the V(x) = 0 value of ∼ 7 × 10^12/cm^2 at the source to a very small value at the drain end at Vds = 0.5 V. To obtain the carrier density profile ns(x), we use the continuity of the drain current yet again to write Id(x) = Id(x = L) in Equation 19.16, and solve for ns(x) on the right by replacing L → x. Using the profile of ns(x) obtained, the field is obtained from Equation 19.13 and the potential V(x) is obtained from Equation 19.12. This directly yields the conduction band profile, since Ec(x) = qV(x) − kB T ln[e^{ns(x)/nq} − 1].

25 The rest of the panels in Fig. 19.15 indicate the role of gate length scaling: because Id ∝ µ·(W/L), increasing the gate length decreases the current drive. For some applications it is necessary to increase L: for example in high-voltage transistors, since the maximum electric field in the channel must be kept below the critical breakdown field on the drain side. The last panel shows that increasing the mobility can help recover the reduction in current for longer gate lengths. This feature is exploited, for example, in high-voltage AlGaN/GaN HEMTs, where the critical field and the electron mobility are higher than in silicon.

446 Zeroes and Ones: The Ballistic Transistor



























Fig. 19.16 MOS capacitor regimes of operation and semiconductor charge vs. surface potential for a doping of 4 × 10^15/cm^3 at T = 300 K.





Fig. 19.17 The gate capacitance Cg of an nMOS capacitor plotted as a fraction of the oxide capacitance. The low-frequency capacitance is obtained directly from 1/Cg = dψs/dQs + 1/Cox, which is derived in Equation 19.18 and the following text. Thus, Cg/Cox = 1/(1 + Cox·dψs/dQs), and Fig. 19.16 shows Qs vs. ψs.

regimes are shown in Fig. 19.16. Instead of the 2D semiconductor channel discussed in earlier sections, the bulk of the silicon (called the body) is doped p-type (upper half of Fig. 19.16), or n-type (lower half). Let us consider the p-type body MOS capacitor: this will lead to the n-channel MOS, or nMOSFET whose conducting channel is a 2D-electron gas that forms as an inversion layer: the semiconductor surface is converted from p-type in the bulk to n-type on the surface (i.e., inverted). The lower half of Fig. 19.16 shows the opposite, or the gate capacitor of the pMOSFET. The nMOS and pMOS together form the complementary MOS, or CMOS that powers much of digital logic and computation today. Let us track the integrated sheet charge in the MOS capacitor for the nMOS (p-type body) shown in the right side plot of Fig. 19.16 by following the charge-field-band diagrams from the left to right. The corresponding measured gate capacitance shown in Fig. 19.17 is a powerful tool in the design and characterization of MOSFETs. When there are no net charges in the semiconductor or metal, there are no electric fields, and the energy bands are flat. This flatband condition is reached when the gate voltage is equal to the difference of the metal and semiconductor work function potentials, VFB = φ M − φS . This is shown in the upper half second column from the left in Fig. 19.16, and corresponds to zero net semiconductor charge in the plot on the right. When the net charge in the semiconductor is zero, the total band


bending, or the surface potential of the semiconductor ψs is also zero. When the gate voltage is made more negative, the negative charges in the metal plate of the MOS capacitor attract more holes to the surface than in the flat-band equilibrium condition, causing a surface hole accumulation layer to form26. Moving to positive Vg, putting positive charge on the metal requires negative charge in the semiconductor for charge neutrality and to terminate the electric field lines originating from the positive charges in the metal. A p-type semiconductor can provide negative charges in two ways: first by depletion, exposing the negatively charged immobile ionized acceptors of density N_A^−, and, if the surface potential is sufficiently large, by inversion, when mobile electrons form in the conduction band. To relate the charge in the semiconductor Qs to the surface potential ψs under all of the above conditions: accumulation, depletion, and inversion, we solve Poisson's equation for a downward band bending ψ(x)27:

$$ \frac{d^2\psi}{dx^2} = -\frac{q}{\epsilon_s}\left[p(x) + N_D - n(x) - N_A\right] = -\frac{q}{\epsilon_s}\left[p_{p0}\,e^{-\frac{\psi}{V_{th}}} - p_{p0} - n_{p0}\,e^{+\frac{\psi}{V_{th}}} + n_{p0}\right] $$

Using the identity 2·(dψ/dx)·(d²ψ/dx²) = d/dx[(dψ/dx)²] and integrating once from the bulk (x1) to the surface (x2):

$$ \left(\frac{d\psi}{dx}\right)^2\Bigg|_{x_1}^{x_2} = -\frac{2q\,p_{p0}}{\epsilon_s}\int_{\psi(x_1)}^{\psi(x_2)}\left[\left(e^{-\frac{\psi}{V_{th}}}-1\right) - \frac{n_{p0}}{p_{p0}}\left(e^{+\frac{\psi}{V_{th}}}-1\right)\right]d\psi $$

$$ \boxed{\;\frac{Q_s}{\epsilon_s} = F_s = \frac{\sqrt{2}\,V_{th}}{L_D}\sqrt{\left(e^{-\frac{\psi_s}{V_{th}}} + \frac{\psi_s}{V_{th}} - 1\right) + \frac{n_{p0}}{p_{p0}}\left(e^{+\frac{\psi_s}{V_{th}}} - \frac{\psi_s}{V_{th}} - 1\right)}\;} \qquad (19.18) $$
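Equation 19.18 is straightforward to evaluate numerically. The sketch below assumes a silicon body (εs = 11.7ε0, ni ∼ 10^10/cm^3) doped at NA = 4 × 10^15/cm^3, as in Fig. 19.16; these material numbers are illustrative assumptions:

```python
import numpy as np

q, eps0, Vth = 1.602e-19, 8.854e-12, 0.0259
eps_s = 11.7 * eps0              # silicon permittivity (assumed body material)
NA = 4e21                        # 4e15 /cm^3 acceptor doping, in /m^3
ni = 1.0e16                      # ~1e10 /cm^3 intrinsic density, in /m^3
pp0 = NA                         # equilibrium hole density in the p-body
np0 = ni**2 / pp0                # equilibrium electron density
LD = np.sqrt(eps_s * Vth / (q * pp0))    # Debye length, ~65 nm here

def Qs(psi_s):
    """|Qs| = eps_s * Fs vs surface potential psi_s, from Eq. 19.18 (C/m^2)."""
    u = psi_s / Vth
    F2 = (np.exp(-u) + u - 1) + (np0 / pp0) * (np.exp(u) - u - 1)
    return eps_s * np.sqrt(2) * Vth / LD * np.sqrt(F2)

# Accumulation (psi_s < 0) and strong inversion (large psi_s > 0) rise steeply;
# in depletion Qs grows only as ~sqrt(psi_s), reproducing Fig. 19.16.
```

Differentiating the resulting Qs(ψs) numerically and using 1/Cg = dψs/dQs + 1/Cox then gives the C–V curve of Fig. 19.17.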


The boxed equation uses Gauss's law Qs = εs·Fs, where Fs is the electric field at the surface of the semiconductor, L_D = √(εs kB T/(q² p_p0)) is the Debye length in the semiconductor, and ψs is the surface potential. It provides the relation between the net semiconductor charge Qs and ψs, which is plotted on the right side of Fig. 19.16. Since the excess gate voltage is related to the surface potential and the semiconductor charge by Vg − VFB = ψs + Qs/Cox, we get the gate capacitance Cg = dQs/dVg by simply differentiating this equation w.r.t. Qs to obtain 1/Cg = dψs/dQs + 1/Cox, which is plotted in Fig. 19.17. The gate capacitance approaches Cox for both accumulation and inversion, and drops in the depletion regime. The threshold voltage for the inversion channel to form is higher for larger NA. The gate capacitance is different in the low- and high-frequency regimes28. Fig. 19.18 shows how the conductive inversion layer at the surface of the MOS capacitor is contacted by the appropriate heavily doped source/drain regions to create the MOSFET. Note that the S/D contact regions are doped the opposite type to the semiconductor body to


26 The excess accumulation hole sheet

density rises exponentially with the surface potential ψs, which is now negative. The electric field points from the positively charged holes to the negative charge on the metal, with most of the voltage drop occurring in the oxide layer. The corresponding measured gate capacitance in accumulation is therefore expected to be, and indeed is, close to the parallel-plate value Cox = εox/tox, as seen in Fig. 19.17, which we derive quantitatively below.

27 Here we have used that deep inside the semiconductor bulk, ND − NA = n_p0 − p_p0, p_p0 = NA, and p_p0·n_p0 = ni². For all x, p(x) = p_p0·e^{−ψ(x)/Vth} and n(x) = n_p0·e^{+ψ(x)/Vth}. We substitute these in the Poisson equation to solve it. To evaluate the integral, we use x1 = +∞ deep inside the semiconductor bulk, and x2 = 0 at the surface of the semiconductor. Since ψ(x) is the downward band bending, it controls the electron density as n(x) = n_p0·e^{+ψ(x)/Vth}, and thus ψs is the net surface band bending, or the surface potential. Because electrons pile up as ψ becomes more positive, the surface of the semiconductor is inverted as qψs approaches the bandgap of the semiconductor.

28 The surface inversion layer of electrons

in the p-type semiconductor must form by thermal generation, which takes time. As shown in Fig. 19.17, the inversion capacitance is different if the gate voltage is swept at a low frequency (LF), which gives the inversion layer time to form. In this case, Cg → Cox. When the gate voltage is swept at a high frequency (HF), the inversion layer does not form. This property enables the use of MOS capacitors in dynamic random access memories (DRAM). See Exercise 19.9 for more.


Fig. 19.18 From MOS capacitors to pMOS and nMOS FETs to the CMOS inverter.

Fig. 19.19 Operation regimes of an nFET and a pFET. The transfer characteristics on top show the drain currents in linear and log scales as a function of the gate voltage. The nFET turns on for Vgs > VTn and the pFET for Vgs < VTp. The output characteristics of Id vs. Vds are shown at the bottom. Fig. 19.18 shows the CMOS configuration using the nMOSFET and pMOSFET.

form a low-resistance contact to the inversion layer. The inversion layer is only present when Vgs > VTn for the nMOSFET, and Vgs < VTp for the pMOSFET. Under these conditions, current flows through the conducting inversion layer between the source and the drain contacts as indicated in Fig. 19.19. The nMOS and pMOS FETs connected as shown in Fig. 19.18 form the complementary MOS, or CMOS inverter. The gates of the pMOSFET and nMOSFET are shorted to form the input terminal, and the drains are shorted to form the output terminal of the inverter. The source contact of the pMOSFET is connected to a positive constant drain voltage Vdd line, and the source of the nMOSFET is connected to the ground line. When the input voltage on the gate is low (Vin = 0), the nMOSFET is off since Vgs = 0 < VTn, as seen in Fig. 19.19. However, since for the pMOSFET Vgs = Vin − Vdd < VTp, the pMOSFET is on, and the pMOS channel resistance is low. The pMOSFET therefore pulls the potential Vout of the output terminal up to Vdd, which is a high voltage. On the other hand, when Vin is high, the pMOSFET turns off, but the nMOSFET turns on, pulling the potential Vout to its source potential, or ground. Thus, the circuit converts a low Vin or a logical zero to a high Vout or a logical one, and a logical one to a logical zero, earning this circuit the name of a logical inverter. The input-output transfer curve is shown in Fig. 19.18, and its derivative −dVout/dVin in the transition regime defines the gain of the CMOS inverter. The CMOS inverter is a very energy-efficient circuit, since there is power dissipation only when the input bit flips; very little power is dissipated to hold the output bit as the inverted input bit when there are no bit flips at the input. Now the MOSFET device can be a long-channel FET, or a very short-channel device which is nearly ballistic. Fig.
19.20 shows the energy band diagram under the gate, another from the source to the drain, and also a k-space occupation function at the source injection point as discussed earlier in this chapter. The major difference from a 2D



Fig. 19.20 Illustrating the physics of a ballistic FET under nonzero drain bias. Field effect transistor, energy band diagram, and k-space occupation of states.

semiconductor channel is the presence of the depletion thickness barrier and potentially several subbands in the triangular quantum well at the semiconductor surface, the lowest of which is labeled as Enz(xmax). The models discussed in the earlier sections of this chapter apply to the MOSFET in both the ballistic limit and the drift-diffusion limit, because the strong confinement of the carriers justifies treating them as 2D systems. Fig. 19.21 shows the practical structure of modern planar MOSFETs, and its vertical embodiment, the FinFET. The gate stack consists of a high-κ/metal gate with a spacer covering the sidewalls. In the planar MOSFET, the electric field is exerted by the gate charges vertically into the semiconductor surface. The current flows between the source and the drain in the surface inversion sheet of width W under the gate of length L, defining the W/L that appears in the drift-diffusion FET model. Because of the need for better electrostatic control of the conducting channel at very small gate lengths, new MOSFETs use channels shaped as a fin, which turns the channel by 90°. The gate exerts the field through the sidewalls of the fins. For this reason, this geometry is referred to as the tri-gate MOSFET, or FinFET. An increase in transistor density can be obtained by bringing the fins closer to each other. Furthermore, the fin may be broken up into several nanowires (or 1D quantum wires), and gates can be wrapped around each wire, forming stacked gate-all-around (or GAA) FETs. HEMTs: Fig. 19.22 shows the cross-section construction of high electron mobility transistors (HEMTs), which are also called heterostructure field-effect transistors (HFETs). HEMTs typically use III-V semiconductors with much higher electron mobilities. The electrons are typically located in a quantum well at an epitaxial heterojunction which is completely crystalline, as opposed to the silicon MOSFET, where the

Fig. 19.21 Planar MOSFET and FinFETs. The conducting channel in FinFETs is inside the fin, and gated from the sides. The FinFET geometry was introduced by Hisamoto, Kaga, and Takeda in 1991 (see Exercise 19.2).










Fig. 19.22 Layer structures and energy band diagrams of GaN and InP HEMTs showing the quantum well (QW).

Fig. 19.23 Performance metrics of a FET.

inversion channel is formed at the interface between the crystalline semiconductor and the amorphous oxide. For this reason, the electrons in the HEMT suffer far lower interface roughness scattering. Furthermore, the mobile electrons in HEMTs are spatially separated from dopants, reducing the Coulomb scattering. The combined reduction of scattering mechanisms leads to typically high mobilities in HEMTs. The mobile carriers are provided for example by modulation doping, as shown for InAlAs/InGaAs HEMTs, or by polarization-induced doping in AlGaN/GaN HEMTs. Because of the high mobilities and high electron saturation velocities, for the same gate length as MOSFETs, HEMTs are typically faster. Compared to silicon nMOS, AlGaN/GaN HEMTs have 10 times higher mobilities, and at the same time offer much higher breakdown voltage capability. These properties make HEMTs attractive for high-speed transistor applications as high-frequency RF amplifiers, and for high-speed switching of large voltages for energy-efficient power electronics. Performance metrics: Fig. 19.23 shows the performance metrics of a general FET. In addition to the highest possible drain current, the maximum drain voltage Vbr that may be applied without breaking down the transistor is an important metric. Since this increases the net area under the Id − Vds curve, it enables high power amplifiers in high-speed electronics, and also high-voltage switching for power electronics. While the maximum on-currents are typically in the range of 1 mA/µm, Vbr can vary widely for different semiconductor technologies and device geometries; in general, a larger bandgap semiconductor leads to a higher Vbr. The maximum power is Pmax ≈ (1/8)·Id^max·(Vbr − Vknee), a part of the shaded rectangle in Fig. 19.23. Fig. 19.23 also shows the transfer curve Id vs. Vgs, and its derivative, which is defined as the transconductance of the FET, gm = ∂Id/∂Vgs. The transconductance takes the role of the gain of the FET; for example, if a load resistance RL is connected to the drain, a change in the gate voltage ∆Vgs appears across this load as (∆Id)·RL, implying the voltage gain (∆Id)·RL/∆Vgs of the FET is gm·RL. The transconductance, like the current, is expressed per unit width; typical values for high-performance FETs are in the 0.1 − 1.0 mS/µm range. Writing the saturation current as Id/W ∼ q ns vsat, where vsat is an ensemble saturation velocity, we get gm ∼ Cg·vsat, since Cg = d(q ns)/dVgs. Fig. 19.23 shows the voltage and power gain of a FET as a function of the frequency of the voltage signal applied at the gate terminal. The intrinsic speed of the FET is related to the time τch it takes for the electron to traverse the channel of length L via fT = 1/(2πτch) = vsat/(2πL). The voltage gain gm·RL goes to unity at this characteristic frequency, called fT. The maximum frequency at which power (∝ Id·Vds) amplification can be obtained is called fmax. Due to additional delays from capacitances and resistances extrinsic to the device, a lower fT is typically obtained. For FETs, the voltage gain cutoff frequency fT


Source/drain ohmic contacts 451

and the unilateral power-gain cutoff frequency fmax are given by

$$ f_T = \frac{1}{2\pi\tau} = \frac{1}{2\pi(\tau_{source} + \tau_{ch} + \tau_{drain})} $$

$$ f_T = \frac{g_m}{2\pi\left[(C_{gs}+C_{gd})(1+g_{ds}R_s) + C_{gd}\,g_m R_s\right]} \approx \frac{g_m}{2\pi C_{gs}} \approx \frac{v_{sat}}{2\pi L} $$

$$ f_{max} = \frac{f_T}{\sqrt{2\left[g_{ds}(R_g + R_s + r_{ch}) + g_m R_g \frac{C_{gd}}{C_{gs}+C_{gd}}\right]}} \qquad (19.19) $$

The gate resistance29 does not appear in fT. As an example, for vsat ∼ 10^7 cm/s and L = 50 nm, the intrinsic fT ∼ 320 GHz, but due to parasitics it is typically lowered to ∼ 200 GHz. The value of fmax is sensitive to the gate resistance, requiring special mushroom- or T-shaped gates as shown in Fig. 19.22 to keep the gate length small at the foot, but at the same time lower the gate resistance by making the head fat. Modern very short gate length CMOS and HEMTs can reach fT ∼ 0.5 − 1.0 THz and fmax in the same range. A primary impediment to even higher speeds is short-channel effects such as drain-induced barrier lowering (Exercise 19.11), due to which Id fails to saturate, leading to large gds. The other impediment is that contact resistances degrade the gain (Exercise 19.15), and are notoriously difficult to lower beyond certain limits posed by the quantum mechanics of crystalline solids, which we discuss next.
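Equation 19.19 can be exercised numerically. The intrinsic fT below uses the vsat and L quoted in the text, while the parasitic element values (gm, Cgs, Cgd, gds, Rs, Rg, rch) are illustrative assumptions, not measured data:

```python
from math import pi, sqrt

# Intrinsic speed: fT = vsat / (2*pi*L), the last limit of Eq. 19.19
vsat = 1e5        # 1e7 cm/s in m/s
L = 50e-9         # gate length, m
fT_int = vsat / (2 * pi * L)      # ~3.2e11 Hz, i.e. ~320 GHz

# Extrinsic fT and fmax with assumed (illustrative) parasitics
gm = 50e-3    # transconductance, S
Cgs = 20e-15  # gate-source capacitance, F
Cgd = 8e-15   # gate-drain capacitance, F
gds = 5e-3    # output conductance, S
Rs, Rg, rch = 3.0, 3.0, 4.0   # source, gate, and channel resistances, Ohm

fT = gm / (2 * pi * ((Cgs + Cgd) * (1 + gds * Rs) + Cgd * gm * Rs))
fmax = fT / sqrt(2 * (gds * (Rg + Rs + rch) + gm * Rg * Cgd / (Cgs + Cgd)))
```

The parasitic capacitance Cgd and resistance Rs pull the extrinsic fT below the ideal gm/(2πCgs), illustrating why a lower fT is typically measured than the intrinsic estimate.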

29 The gate-source capacitance Cgs is desired, and the primary extrinsic undesired capacitance is the gate-drain capacitance Cgd. The undesired resistances are the source resistance Rs and the resistance of the gate metal Rg; the output conductance gds is similarly undesired. The change in the drain current cannot follow the gate voltage change any faster than it takes the electron to go from the source to the drain. This offers an intuitive explanation for fT ≈ vsat/(2πL) in Equation 19.19.


19.7 Source/drain ohmic contacts We discussed the construction of tunneling ohmic contacts in Chapter 18, Fig. 18.11. The goal is to make the source resistance Rs and drain resistance Rd of the FET small enough that the resulting voltage drop Id(Rs + Rd) in the contacts is a small fraction of the drain voltage Vds that drives the current. In Chapter 17, Section 17.2 we discussed that there is a quantum limit on how low a resistance a semiconducting or metallic channel in a crystalline solid can attain. The maximum conductance for a 1D mode is 2q²/h, and the conductances of several modes add to give a net contact resistance normalized to the width of a 2D conducting channel of Rc·W ∼ h/(2q²kF) ∼ (0.026/√n2d) kΩ·µm. Experimental contact resistances can today reach close to the quantized conductance limits, as indicated in Fig. 19.24. Achieving the low contact resistance limits requires significant efforts in doping and metal choices catered towards the chemistry and physics of each semiconductor/metal choice. This quantum limit constrains the speed and performance of the latest generation of nanoscale FETs, both in digital and high-speed amplifier applications. We saw in the earlier sections of this chapter that the best nanoscale FET channels are capable of carrying currents in the Id/W ∼ 5 mA/µm range at Vds ∼ 0.5 V in the ballistic limit. The voltage drop at the source and drain contacts at the quantum limit for n2d = 10^13/cm^2 is then Vc = (Id/W)·(Rs·W + Rd·W) =






Fig. 19.24 Source and drain contact resistances and their quantum limits for crystalline semiconductor FETs with 2D conducting channels. Since each 1D mode can have a conductance maximum of 2q²/h, the resistance of a 2D channel composed of many 1D modes is Rc·W ∼ h/(2q²kF) ∼ (0.026/√n2d) kΩ·µm, scaled to the width of the contact. The resistance is between the heavily doped 3D regions and the 2DEG, like in the MOSFET.


30 Because the work functions of elemental metals range from 2.5 − 6.0 eV, it is difficult to find optimal choices for both n- and p-contacts to very wide bandgap semiconductors such as nitrides (e.g. AlGaN and AlN) and diamond, whose bandgaps can exceed 5 eV.

31 This is similar to a typical dilemma

of choosing air travel over trains: most of the time spent in flying a short distance goes into getting to the airport and checking in, boarding, and getting off: the flight time is ’ballistic’, or minuscule in comparison. This is analogous to a short channel, or ballistic FET: a significant portion of the delay and voltage drop is in the contacts, and not in the channel. For traveling a long distance, flight is an easy choice since the flight time is much longer than the waits at the airports, mimicking the fact that the channel resistance and delay is longer than in the contacts for a long-channel FET.

2(Id/W)·(Rc·W) ∼ 0.26 V, which is unfortunately a significant fraction of Vds. Even at a current of 1 mA/µm, the voltage drop at the contacts is far from a negligible fraction of Vds. This is why a significant effort in lowering the contact resistances is needed to harvest the fruits of scaling the channel lengths, because Id²(Rs + Rd) is an irrecoverable loss of energy in the device. This loss adds up fast over several billion transistors! The source and the drain contacts are designed for as low a metal-semiconductor ohmic resistance as possible, by making use of heavy doping in the semiconductor and choosing low barrier-height metals. Because in an n-type channel FET the current is carried by mobile electrons in the conduction band, a low-resistance ohmic contact is necessary to the conduction band of the semiconductor. For a p-channel FET a low-resistance ohmic contact is necessary to mobile holes in the valence band. As a result, a typical choice for an nFET ohmic contact is a metal whose Fermi level E^n_FM nearly lines up with the conduction band edge Ec of the semiconductor, so that the Schottky barrier height is the smallest possible. The n-contact resistance is lowered further by doping the semiconductor heavily with donors near the metal/semiconductor junction to reduce the depletion region to a few nm, which enables efficient tunneling. Similarly, for a pFET the metal Fermi level E^p_FM must line up as close as possible to the valence band edge EV of the semiconductor. The ideal n-ohmic metal and p-ohmic metal for a given semiconductor30 must therefore have the difference of their work functions close to the bandgap of the semiconductor, E^n_FM − E^p_FM ≈ Ec − Ev = Eg. The practical choice of ohmic metals involves several factors concerning the interface, chemical, and processing stability and reliability. Though the contact resistance for all FETs should be made as small as possible, it is especially necessary for short-channel FETs.
This is because as the distance from the source to the drain is scaled down, the channel resistance reduces and approaches the ballistic limit, but the contact resistance does not reduce. The contact resistance is not modulated by the gate either. This causes an all-around loss of control and energy efficiency of the transistor. As we discussed in the last section, the contact resistances significantly restrict the speed of switching and amplification by the FET. If the S/D distance is very long, and the resistance of the channel dominates over the contact resistance, the deleterious effects of the contact resistance are partially suppressed31 .
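The arithmetic of the contact voltage-drop estimate can be sketched as follows; the quantum Rc·W uses kF = √(2π·n2d) (spin degeneracy 2, single valley assumed), and the 26 Ω·µm and 5 mA/µm figures are the ones quoted in the text:

```python
from math import pi, sqrt

h, q = 6.626e-34, 1.602e-19   # Planck constant and electron charge

# Quantum limit of the contact resistance-width product for a 2D channel
n2d = 1e17                    # 1e13 /cm^2 expressed in /m^2
kF = sqrt(2 * pi * n2d)       # Fermi wavevector (gs = 2, gv = 1 assumed)
RcW = h / (2 * q**2 * kF)     # Ohm*m; ~1.6e-5 Ohm*m, i.e. a few tens of
                              # Ohm.um, the same scale as the ~0.026 kOhm.um
                              # quoted in the text

# Voltage lost in source + drain contacts at the text's quoted numbers
RcW_um = 26.0                 # Ohm.um per contact (0.026 kOhm.um, from the text)
Id_W = 5e-3                   # A/um, ballistic-limit current density
Vc = Id_W * 2 * RcW_um        # total S + D contact drop, V
```

With Vds ∼ 0.5 V this gives Vc ∼ 0.26 V, i.e. over half the drive voltage is lost in the contacts, which is the point of the paragraph above.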

19.8 A brief history of FETs The idea of the field-effect transistor was conceived much earlier than bipolar transistors. The earliest patent was filed in 1925 by Lilienfeld (Fig. 19.25), who had moved from the University of Leipzig in Germany to the United States for his entrepreneurial pursuits. The patent put forward the idea that if the conductivity of a material could be


modulated by field-effect, it could perform the same operation as the electronic amplification that was at the time achieved by using vacuum tubes. The understanding of semiconductors and their underlying conductivity occurred in the early 1930s, after Bloch's concepts of electron wave transport in crystals, and Alan Wilson's subsequent explanation of the difference between metals and semiconductors based on filled and empty bands. In 1945, Heinrich Welker (Fig. 19.26) patented the idea of a junction-FET or JFET. Research groups at the Bell Laboratories attempted to demonstrate the field-effect phenomena driven by William Shockley's ideas, but the electrons in dangling bond surface states thwarted these efforts. The electric field lines that were supposed to penetrate into the semiconductor and contribute to the modulation of conductivity were essentially terminated at the surface states. That the surface states were responsible for the lack of field-effect was identified by John Bardeen, after examining several experimental efforts of his colleague Walter Brattain. In fact, they could demonstrate a weak field-effect with electrolyte gating, but not in the solid state. In these investigations, they chanced upon a different kind of device, the point contact transistor, the precursor to the first semiconductor bipolar junction transistor, discussed in Chapter 18. Motivated by the idea that the depletion region of a reverse-biased pn junction increases with the reverse-bias voltage, Shockley in 1952 quantitatively evaluated the output characteristics of a JFET and offered a blueprint of its design. A JFET was subsequently demonstrated by Dacey and Ross at the Bell Laboratories in 1953 as the first operational FET. Meanwhile, in 1953 when investigating the properties of n-p-n bipolar transistors with a floating collector, W. L.
Brown in Bell Laboratories made the surprising observation that a floating collector terminal could attain the same potential as the emitter, in spite of being separated by a p-layer. He was able to prove that this was because the surface of the p-type layer was inverted, forming a conducting electron channel. Soon after, efforts to electrically passivate the surface states using SiO2 succeeded, significantly lowering the surface recombination currents that hurt the performance of the rapidly advancing state-of-the-art bipolar junction transistors. In 1959, Mohammed Atalla and Dawon Kahng (Fig. 19.27) working at the Bell Laboratories were able to stabilize the surface of silicon and succeeded in realizing the first silicon MOSFET. In 1963, CT Sah and Frank Wanlass (Fig. 19.28) of Fairchild Semiconductors conceived of and demonstrated the first complementary MOS (CMOS) logic inverter. Several variants of the FET soon followed: for example, Carver Mead demonstrated the metal-semiconductor FET (MESFET) using a Schottky gate in 1966, and Takashi Mimura introduced the modulation-doped high-electron mobility transistor (HEMT) in 1979. In 1980, a fundamentally new physical phenomenon, the integer quantum Hall effect, was observed in FETs, followed by the fractional quantum Hall effect in 1982 (these are discussed in Chapter 25). To date, more than 10^22 transistors have been fabricated, which is more than the number of stars in the Milky Way galaxy! It is likely the largest num-

A brief history of FETs 453

Fig. 19.25 Julius Lilienfeld filed the earliest patent for the field-effect transistor in 1925, before the physics of semiconductors, or even the fundamentals of quantum mechanics were understood.

Fig. 19.26 Heinrich Welker, Arnold Sommerfeld’s student, realized the earliest junction-Field Effect Transistor, or JFET. Welker was a pioneer in III-V semiconductors and heterostructures.

Fig. 19.27 Atalla and Kahng realized the first MOSFET in Bell Laboratories in 1959.

Fig. 19.28 In 1963, CT Sah and Frank Wanlass realized and demonstrated the remarkable properties of the CMOS circuit by combining the nMOS and pMOS in its now iconic format.


ber of devices that have been manufactured in the micro- and nanoscale to exacting standards by the human race. The FET has changed the way logic, memory, and communication are performed on earth (and beyond!) in a profound manner. In addition to fundamentally changing the fabric of logic, memory, and communication systems, the FET has contributed to significant discoveries in several other fields, such as in radio astronomy, weather prediction and metrology, medicine and biology, and imaging and sensing (e.g. the CCD camera is based on MOSFETs). The increased computing power from the scaling of FETs enabled significant advances in the capacity to train computers to "think" and "learn" (so-called machine learning), and is advancing artificial intelligence. In fact, in the past few decades the FET has repeatedly been "written off" as impossible to improve. But every few years there have been breakthroughs, and FETs have emerged better, faster, more efficient, and far more capable.

19.9 Chapter summary section In this chapter, we learned:

• The FET operates by gate voltage control of the drain current, by controlling the number of mobile carriers in a conductive channel by field-effect. The drain current is controlled exponentially below a threshold voltage, Id ∼ e^{Vgs/Vth}, and in a polynomial fashion above threshold, Id ∼ (Vgs − VT)^n. In the on-state, the carrier density for 2D channel FETs is ∼ 10^13/cm^2, and the on-state current densities are ∼ mA/µm, even when there is no scattering in a ballistic FET.

• If there is significant scattering in the channel, the carrier mobility plays an important role, and so does the gate length, since Id ∼ µ·(W/L). Then, scaling the FET to smaller gate lengths improves the device performance.

• The silicon MOSFET operates with an inversion channel as the conducting layer. Since both nFETs and pFETs can be realized on the same wafer, it enables energy-efficient complementary MOS, or CMOS circuits. On the other hand, high electron mobility transistors (HEMTs) based on III-V semiconductor heterostructures are used for power amplifiers that can operate at high efficiencies at very high frequencies.


Further reading

Sze's Physics of Semiconductor Devices treats the MOSFET and the zoo of various FETs most comprehensively using the drift-diffusion theory. Grove's Physics and Technology of Semiconductor Devices is a classic. Fundamentals of Modern VLSI Devices by Taur and Ning has a modern treatment, and includes in some chapters the ballistic FET models. The ballistic FET models are most extensively discussed in the short book Nanoscale Transistors by Lundstrom and Guo. I strongly recommend reading Natori's 1994 Journal of Applied Physics paper, in which he introduced the ballistic FET model. Every device physicist has a personal favorite book for (mostly) unexplained reasons; mine is Device Electronics for Integrated Circuits by Muller, Kamins, and Chan.

Exercises

(19.1) Practice problem: long channel FETs
Equations 19.17 give the standard "long-channel" model current-voltage characteristics of FETs as a function of gate voltage. Answer the following questions:
(a) Calculate the saturated on-current for a channel mobility of $\mu = 150$ cm²/V·s, $t_b = 10$ nm, $\epsilon_b = 3.9\epsilon_0$, $L = 1$ µm, and $W/L = 10$ at $V_{gs} - V_T = 1$ Volt.
(b) Note that the saturated current in part (a) does not depend on $V_{ds}$. For what values of $V_{ds}$ is this approximation correct?
(c) Typical saturation current densities per unit width $I_d^{on}/W$ approach 1 mA/µm in most FETs today. How does the above calculation compare?
(d) Calculate the off-state current per unit width $I_d^{off}/W$ for the same dimensions as above, and $V_{gs} - V_T = -1.0$ V. What is the on/off ratio $I_d^{on}/I_d^{off}$ for $V_{gs} - V_T = \pm 1$ Volt?
(e) Examine the impact of the geometric term $W/L$ in the drive current of FETs. How can one maximize the on-current for a given voltage, and what are the tradeoffs? The ratio $W/L$ is an important design parameter in the layout of transistors in circuits in microprocessors.
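A quick numerical check of part (a), assuming the standard square-law saturation model $I_d^{sat} = \frac{\mu C_{ox}}{2}\frac{W}{L}(V_{gs}-V_T)^2$ with $C_{ox} = \epsilon_b/t_b$ (the square-law form is the usual long-channel result; the variable names in the code are illustrative):

```python
# Square-law saturation-current estimate for Exercise 19.1(a).
eps0 = 8.854e-12              # vacuum permittivity [F/m]
mu = 150e-4                   # 150 cm^2/(V.s) -> m^2/(V.s)
t_b = 10e-9                   # barrier (oxide) thickness [m]
c_ox = 3.9 * eps0 / t_b       # gate capacitance per unit area [F/m^2]
w_over_l = 10.0
length = 1e-6                 # gate length L [m]
v_ov = 1.0                    # overdrive Vgs - VT [V]

id_sat = 0.5 * mu * c_ox * w_over_l * v_ov**2    # saturated drain current [A]
id_per_width = id_sat / (w_over_l * length)      # [A/m]; note 1 A/m = 1 uA/um
print(f"Id,sat ~ {id_sat*1e3:.2f} mA; Id/W ~ {id_per_width:.0f} uA/um "
      f"(well below the ~1000 uA/um of modern FETs, cf. part (c))")
```

The resulting current per unit width of a few tens of µA/µm is a useful reference point for part (c)'s comparison with modern devices.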

(19.2) Shedding the fat: DELTA
In 1991, Hisamoto, Kaga, and Takeda of Hitachi Central Laboratories in Japan introduced the DELTA, a fully depleted lean-channel transistor. Fig. 19.29 shows the FET, an energy band diagram, and the sub-threshold slope as a function of the gate length. This work is one of the earliest demonstrations of the FinFET geometry of transistors. Answer the following questions about this work.
(a) Read the paper critically, and note down the advantages of this geometry as described by the authors to prevent several problems encountered in planar transistors.
(b) Discuss how they prove these advantages experimentally in the output characteristics of the FETs. Most of the advantages stem from removing the "bulk" semiconductor, which is the effective "fat" in the device.
(c) Describe how the design of today's FinFETs with channel lengths below 10 nm can trace its origins back to this initial demonstration.
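Exercise 19.3 below involves the Fermi–Dirac integral of order 1/2, which has no closed form. A simple numerical quadrature is handy when making the requested plots; the grid parameters below are illustrative assumptions:

```python
import numpy as np
from math import gamma

def fd_half(eta, xmax=60.0, n=200_000):
    """Fermi-Dirac integral of order 1/2:
    F_{1/2}(eta) = (1/Gamma(3/2)) * int_0^inf sqrt(x)/(1 + exp(x - eta)) dx,
    evaluated by the midpoint rule on [0, xmax]."""
    dx = xmax / n
    x = (np.arange(n) + 0.5) * dx
    integrand = np.sqrt(x) / (1.0 + np.exp(np.clip(x - eta, -700.0, 700.0)))
    return integrand.sum() * dx / gamma(1.5)

# Known limits: F_{1/2}(eta) -> e^eta (non-degenerate), F_{1/2}(0) ~ 0.765
print(fd_half(-10.0), np.exp(-10.0))   # nearly equal in the non-degenerate limit
print(fd_half(0.0))                    # ~0.765
```

The non-degenerate limit $F_{1/2}(\eta) \to e^{\eta}$ and the value $F_{1/2}(0)\approx0.765$ provide quick sanity checks before using the routine in the exercises.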

Fig. 19.29 The DELTA FET, from IEEE Transactions on Electron Devices, vol. 38, page 1399 (1991). This work is one of the earliest justifications and experimental demonstrations of the advantages of the FinFET.

(19.3) The ballistic field-effect transistor
In the chapter, we derived the characteristics of a ballistic field-effect transistor. You are going to fill in a few steps, and solve a closely related problem.
(a) Make log-scale and linear-scale plots of the gate-induced 2D electron gas (2DEG) carrier density at 300 K and 77 K vs. the gate voltage $V_{gs}$ of FETs for an insulating barrier of $t_b = 2$ nm and $\epsilon_b = 10\epsilon_0$, for three semiconductor channels: one that has $m_c^{\star} = 0.2m_0$, $g_s = 2$, and $g_v = 2$; a second with $m_c^{\star} = 0.2m_0$, $g_s = 2$, and $g_v = 1$; and a third with $m_c^{\star} = 0.05m_0$, $g_s = 2$, and $g_v = 1$. What is the difference? Compare with the figures in this chapter.
(b) Show why the ballistic current density is given by $J_{2d} = J_0[F_{1/2}(\eta_s) - F_{1/2}(\eta_s - v_d)]$, where $J_0 = q\hbar N_c/m_c^{\star}$, all symbols have their usual meanings as they appear in the book, and $N_c$ is the effective conduction band edge DOS.
(c) Make a plot of the ballistic FET currents of the three semiconductors of part (a). Make the drain current $I_d$ vs. gate voltage $V_{gs}$ in the linear and log scales, and the drain current $I_d$ vs. drain voltage $V_{ds}$ plots, similar to those shown in the figures in this chapter.
(d) Describe qualitatively what sorts of changes in the device characteristics you would expect if, instead of the 2DEG channel, you had a 1D channel in the ballistic FET. Remember you have shown before that the ballistic conductance per 1D channel is limited to the quantum of conductance $G_0 = q^2 g_s g_v/h$, where $h$ is Planck's constant.

(19.4) Double-gate and gate-all-around FETs
In FETs made with semiconductor channels in the form of thin membranes, such as in silicon-on-insulator (SOI) or 2D semiconductor crystals, gates above and below the plane of the semiconductor channel (see Figures 19.5 and 19.6) provide stronger gate control of the channel conductivity. Answer the following questions:
(a) Show that for symmetric barriers on the top and the bottom with dielectric constant $\epsilon_b$ and thickness $t_b$, the gate capacitance $C_g$ is twice that of a single barrier of the same material and geometry.
(b) Show that the long-channel saturated current drive per unit width $I_d^{on}/W$ increases for the same $V_{gs} - V_T$. Show that this boost in current drive with higher $C_g$ remains true for the ballistic FET characteristics (Equations 19.7), though the quantitative dependence is different.
(c) The increased gate control is crucial in making FETs with shorter channel lengths and reduced dimensions. For 1D channels, wrapping the gate all around the channel (in the gate-all-around geometry) allows scaling to short channel lengths. Discuss the progress made in this direction in nanoscale FETs today, and the challenges and opportunities in their realization.

(19.5) Barrier and quantum capacitance ratios
In the quantum mechanical ballistic model of the FET, the on-current depends on the ratio of the barrier and quantum capacitances $C_b/C_q$ in a non-monotonic manner. Examine this dependence and, as discussed in the text, show that for an optimal ratio the drive current is maximized. Discuss how this optimal choice ties the design of the barrier dielectric material and its thickness to the electron bandstructure properties of the semiconductor channel.

(19.6) Jump-start: the injection velocity
Equation 19.8 showed that the injection velocity $v_{inj}$ at the source side of the channel in a ballistic FET plays a central role in its output characteristics.
(a) Discuss what properties of the semiconductor at the source are desirable to maximize the saturated injection velocity $v_{inj}^{sat}$ for a given $V_{gs} - V_T$. Note that the gate capacitance $C_g$ also includes the properties of the semiconductor channel via the quantum capacitance.
(b) Narrower bandgap semiconductors with small $m_c^{\star}$ boost the injection velocity. But what limitations do they pose in achieving other desirable transistor characteristics (e.g. a high on/off ratio)?

(19.7) Linearity of field-effect transistors
For many applications in analog electronics, the dependence of the FET drain current $I_d$ on the gate voltage $V_{gs} - V_T$ is desired to be highly linear. Answer the following questions:
(a) Based on the dependence of the long-channel (drift-diffusion) and ballistic FETs, discuss which devices exhibit high linearity in the saturated on-state current regime.
(b) Show that if a sinusoidal signal of the form $V_{gs}e^{i\omega_0 t}$ is fed into the gate, then only a perfectly linear transistor characteristic will faithfully produce an amplified signal in the drain current $I_d$ at the same frequency $\omega_0$. Prove that any nonlinearity in the $I_d$ vs. $V_{gs} - V_T$ characteristics will produce components of drain current at harmonics of $\omega_0$.
(c) For a perfectly linear FET, the transconductance $g_m = \partial I_d/\partial V_{gs}$ should be a constant, and all higher derivatives such as $g_m' = \partial^2 I_d/\partial V_{gs}^2$, $g_m'' = \partial^3 I_d/\partial V_{gs}^3$, ... must be zero.
Examine whether this is the case for the long-channel drift-diffusion FET models and the ballistic FET models that are described in this chapter. If not, what factors contribute to the non-linearity?

(d) Discuss strategies by which the transistor channel design can be modified to make the FET intrinsically more linear. What bandstructures lead to higher linearity?
(e) In the chapter, we have neglected the effect of source and drain access and contact resistances on the external $g_m$ vs. $V_{gs} - V_T$ characteristics. Argue why such resistances degrade the maximum value of the $g_m$, but can increase the linearity.
(f) Some design paradigms adopt a circuit-level approach with multiple-$V_T$ transistors in parallel while using the same channel semiconductor. Explore and discuss these strategies.

(19.8) Metal-semiconductor FETs (MESFETs), junction FETs (JFETs), and so on...
A large array of field-effect transistor geometries exists, with relative merits and demerits in terms of performance and ease of fabrication. For example, the MESFET uses a metal-semiconductor Schottky gate on doped semiconductor channels. The junction FET (JFET) uses a pn-junction gate to take advantage of the large breakdown field and low leakage current of pn junctions over Schottky diodes. Perform a survey of the various types of FETs, and summarize your results in the form of a table, with typical FET metrics in the columns and the different FETs in the rows for comparison. Typically a FET of a given type is attractive for certain applications; list the suitable application in one of the columns.

(19.9) Short-term memory: MOS DRAM

Fig. 19.30 Robert Dennard, the inventor of the 1-transistor 1-capacitor DRAM cell. He was also an original architect of the constant field scaling rule of transistors.

In the silicon MOS capacitor of Fig. 19.16, the capacitance-voltage characteristics were shown in Fig. 19.17, and its transient behavior was discussed. The MOS capacitor is used as a dynamic random-access memory (DRAM) in modern microprocessors. Answer the following questions to explore how the device is used for this purpose.
(a) Suppose for time $t < 0$ a p-type bulk Si MOS capacitor was in a slight accumulation regime at $V_{gs} = V_{FB} - 0.5$ Volt. Note that a p-type body implies an NMOS capacitor. At time $t > 0$, a gate voltage $V_{gs} = V_{FB} + 1.0$ Volt is applied and kept at that bias. Describe the physics of the time evolution of the gate capacitance.
(b) Discuss the microscopic processes that lead to, initially, a large drop in the capacitance, and ultimately an increase to the value $C_{ox}$.

(c) Argue why the carriers for the inversion channel must necessarily come from the valence band by interband transitions. Over what time scale can the capacitance remain at the low value?
(d) Discuss how this delay is used as the memory in DRAM. Explain why it is called "dynamic" - meaning, explain why the data desired to be stored needs to be refreshed dynamically.
(e) Very high densities of DRAM memory are achieved today by using 3D capacitance geometries - discuss these innovations, and the challenges in DRAM technologies in the future.

(19.10) Moore's "law"
Discuss the statement of Moore's law of scaling of transistors, and the fundamental quantum mechanical and practical limits encountered in scaling transistors below a few-nm channel lengths. Several semiconductor material and device innovations have been made in the past to keep Moore's law scaling on track for several decades. Write a short essay on your view of the future of this path - and remember that several times in the past, "the news of the death of Moore's law has been greatly exaggerated"!

(19.11) DIBL and GIDL
Examine the following two common mechanisms that result in degradation of transistor characteristics at very short channel lengths:
(a) Drain-induced barrier lowering (DIBL): In the saturation regime of the output characteristics, the drain current in a FET must become independent of the drain voltage, and only be controlled by the gate voltage. In other words, the barrier height seen at the source end of the channel for electrons must only be controlled by the gate voltage, and not the drain voltage. However, for short gate lengths a large drain voltage starts competing with the gate in controlling the barrier height at the source-injection point. Discuss typical experimental indicators of DIBL and the methods used to keep it at bay.
(b) Gate-induced drain leakage (GIDL): Band-to-band tunneling currents (which are discussed in Chapter 24) are enhanced by the gate voltage in short-channel FETs on the drain side, where a high field is reached. This results in gate-induced drain leakage currents. Discuss the experimental indicators and methods to counter GIDL in short-channel FETs.

(19.12) Discrete dopant effects
Consider a cube of silicon of side 10 µm. Calculate roughly the number of dopants in it for doping densities of $N_d = 10^{16}$/cm³ and $N_d = 10^{20}$/cm³. Now repeat the calculation for a silicon cube of side 10 nm. Since the doping density maintains the Fermi level of the semiconductor, discuss why any variation of the dopant number will lead to a variation in important device characteristics, ranging from threshold voltages to variability from one transistor to another. What methods are used to fight the discrete dopant effect in nanoscale transistors?

(19.13) Strained silicon
Application of strain to a semiconductor modifies its bandstructure. If done correctly, it can boost the electrical conductivity by rearranging the bands such that the light effective mass carriers are populated in preference to heavier ones. Explain how strained semiconductors have been used to improve field-effect transistor performance via mobility enhancement in silicon technology.

(19.14) HKMG: high-K, metal gate
Discuss why in the mid-2000s the silicon CMOS industry made a transition from polysilicon gates with SiO2 dielectric to high-K dielectrics and metal gates, to prevent excessive gate leakage currents by tunneling. The HKMG method paved the way for the introduction of the FinFET geometry; discuss why this method is well suited to the vertical geometry and its use in the future.
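Exercise 19.12's counting argument is quick to sketch numerically; the Poisson $\sqrt{N}$ fluctuation estimate is an added assumption for illustration, not part of the exercise statement:

```python
# Dopant-count estimate for Exercise 19.12 (Poisson fluctuation added for illustration)
sides_um = [10.0, 0.01]        # cube edge lengths: 10 um and 10 nm
dopings = [1e16, 1e20]         # doping densities [1/cm^3]

for side in sides_um:
    vol_cm3 = (side * 1e-4) ** 3        # edge in um -> cm; volume in cm^3
    for nd in dopings:
        n_dopants = nd * vol_cm3        # mean number of dopants in the cube
        rel_fluct = n_dopants ** -0.5   # Poisson: sigma/N = 1/sqrt(N)
        print(f"edge = {side*1e3:8.0f} nm, Nd = {nd:.0e}/cm3 -> "
              f"N = {n_dopants:.3g} dopants, relative fluctuation ~ {rel_fluct:.2g}")
```

The 10 µm cube holds $10^7$–$10^{11}$ dopants, so fluctuations are negligible; the 10 nm cube holds between a hundred and (statistically) a hundredth of a dopant, which is the heart of the variability problem the exercise asks about.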

(19.15) Gain and speed of FETs
(a) Discuss the tradeoffs between the gain and speed of field-effect transistors.
(b) Since the current gain cutoff frequency $f_T$ depends on the gate length, a common metric is the $f_T \cdot L_g$ product for a semiconductor material. Argue why this product is closely related to an effective saturation velocity of carriers in the FET.
(c) The Johnson figure-of-merit (JFOM) is the product of the saturation velocity of carriers $v_{sat}$ and the critical breakdown field $F_{br}$. Make a table of the JFOM for various semiconductor materials and discuss the advantages of wide-bandgap semiconductors for this metric.
(d) Make a table of the fastest FETs made of various semiconductor materials to date, including their gate lengths, contact resistances, cutoff frequencies, and breakdown voltages. Note that the fastest transistors today exceed 1 THz cutoff frequencies by virtue of high carrier velocities and low parasitic resistances and capacitances.
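For part (c), a starting-point sketch with representative order-of-magnitude literature values of $v_{sat}$ and $F_{br}$ (these numbers are assumptions for illustration, not values given in this chapter):

```python
# JFOM = vsat * Fbr; representative order-of-magnitude literature values (assumed)
materials = {
    # name: (vsat [cm/s], Fbr [MV/cm])
    "Si":   (1.0e7, 0.3),
    "GaAs": (1.2e7, 0.4),
    "SiC":  (2.0e7, 3.0),
    "GaN":  (2.5e7, 3.3),
}
jfom = {name: v * f for name, (v, f) in materials.items()}
for name in sorted(jfom, key=jfom.get):
    print(f"{name:5s} JFOM = {jfom[name]:.1e}  (x{jfom[name]/jfom['Si']:.0f} vs Si)")
```

Even with rough inputs, the wide-bandgap materials come out more than an order of magnitude ahead of silicon, which is the point of the exercise.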

Fermi’s Golden Rule


In this chapter, we derive and get comfortable with Fermi's golden rule (Fig. 20.1), a central result of time-dependent perturbation theory. Because the golden rule provides a prescription to compute transition rates between quantum states, it is used heavily in subsequent chapters to understand the electronic transport and photonic properties of semiconductors. In this chapter, we learn:

• How are time-dependent quantum processes modeled?
• Under what conditions is Fermi's golden rule valid?
• How can we use the golden rule to understand electronic and photonic processes in semiconductors?
• How do time-dependent processes determine the operation of qubits in quantum computation?

20.1 Fermi's golden rule
20.2 Oscillating perturbations
20.3 Transitions to continuum
20.4 Kubo–Greenwood formula
20.5 Decoherence in qubits
20.6 Electron-electron scattering
20.7 Dyson series and diagrams
20.8 Zero-sum game: self energy
20.9 Chapter summary section
Further reading

20.1 Fermi's golden rule

Consider an unperturbed quantum system in state $|\Psi_{t_0}\rangle$ at time $t = t_0$. It evolves to the state $|\Psi_t\rangle$ at a future instant $t$. The time evolution of the state vector is governed by the unperturbed Hamiltonian operator $\hat{H}_0$ according to the time-dependent Schrödinger equation

$$ i\hbar \frac{\partial}{\partial t}|\Psi_t\rangle = \hat{H}_0\,|\Psi_t\rangle. \qquad (20.1) $$

If the system was in an eigenstate $|\Psi_{t_0}\rangle = |0\rangle$ of energy eigenvalue $E_0$ at time $t_0$, then the state at a future time differs from the initial state by only a phase factor:

$$ \hat{H}_0|\Psi_{t_0}\rangle = E_0|\Psi_{t_0}\rangle \implies |\Psi_t\rangle = e^{-i\frac{E_0}{\hbar}(t-t_0)}\,|\Psi_{t_0}\rangle. \qquad (20.2) $$

This is a stationary state: if the quantum state started in an eigenstate, it remains in that eigenstate as long as there is no perturbation. But the eigenvector still "rotates" in time with frequency $\omega_0 = E_0/\hbar$ in the Hilbert space, as indicated schematically in Fig. 20.2. It is called stationary because physical observables of the eigenstate require not the amplitude, but the inner product, which is $\langle\Psi_t|\Psi_t\rangle = \langle\Psi_{t_0}|\Psi_{t_0}\rangle$. This is manifestly stationary in time.

Now let us perturb the system with a time-dependent term $\hat{W}_t$. This perturbation can be due to a voltage applied on a semiconductor device, or electromagnetic waves (photons) incident on a semiconductor.

Fig. 20.1 Fermi’s Golden rule was initially formulated by Dirac; Fermi referred to his own version as the ”Golden rule” No. 2. It is ”golden” because of its wide-ranging use in evaluating rates at which quantum processes occur.


Fig. 20.2 Schrödinger vs. interaction pictures of time-evolution of a quantum state.

The new Schrödinger equation for the time evolution of the state is

$$ i\hbar \frac{\partial}{\partial t}|\Psi_t\rangle = [\hat{H}_0 + \hat{W}_t]\,|\Psi_t\rangle. \qquad (20.3) $$

In principle, solving this equation will yield the complete future quantum states. In practice, this equation is unsolvable, even for the simplest of perturbations¹. Physically, the perturbation will "scatter" a particle that was, say, in state $|0\rangle$ to state $|n\rangle$. However, we had noted that even in the absence of perturbations, the eigenvectors were already evolving with time in the Hilbert space. For example, state vector $|0\rangle$ was rotating at an angular frequency $\omega_0$, and state vector $|n\rangle$ at $\omega_n$. This is shown schematically on the left of Fig. 20.2. It would be nice to work with unperturbed state vectors that do not change in time, as on the right of Fig. 20.2. This calls for a transformation to a vector space that "freezes" the time evolution of the unperturbed eigenvectors. Such a transformation is achieved by the relation

¹ A two-level quantum system offers a rare and famous exactly solvable time-dependent perturbation problem, and is the basis for several practical applications ranging from NMR spectroscopy to quantum computation. It is discussed in Section 20.5.

$$ |\Psi_t\rangle = e^{-i\frac{\hat{H}_0}{\hbar}t}\,|\Psi(t)\rangle, \qquad (20.4) $$

where $\hat{H}_0$ is the Hamiltonian operator. Note that the operator² now sits in the exponent, but it should not worry us much. We will see that it is rather useful to have it up there. The reason for this non-obvious transformation becomes clear when we put it into the Schrödinger equation of Equation 20.3 to get

$$ i\hbar\left[-i\frac{\hat{H}_0}{\hbar}e^{-i\frac{\hat{H}_0}{\hbar}t}|\Psi(t)\rangle + e^{-i\frac{\hat{H}_0}{\hbar}t}\frac{\partial}{\partial t}|\Psi(t)\rangle\right] = [\hat{H}_0 + \hat{W}_t]\,e^{-i\frac{\hat{H}_0}{\hbar}t}|\Psi(t)\rangle, \qquad (20.5) $$

and there is a crucial cancellation, leaving us with

$$ \boxed{\; i\hbar\frac{\partial}{\partial t}|\Psi(t)\rangle = \left[e^{+i\frac{\hat{H}_0}{\hbar}t}\,\hat{W}_t\,e^{-i\frac{\hat{H}_0}{\hbar}t}\right]|\Psi(t)\rangle = \hat{W}(t)\,|\Psi(t)\rangle, \;} \qquad (20.6) $$

where $\hat{W}(t) = e^{+i\frac{\hat{H}_0}{\hbar}t}\,\hat{W}_t\,e^{-i\frac{\hat{H}_0}{\hbar}t}$. Can we take the operator $e^{-i\frac{\hat{H}_0}{\hbar}t}$ from the left to the right side as $e^{+i\frac{\hat{H}_0}{\hbar}t}$? Yes we can, because $e^{+i\frac{\hat{H}_0}{\hbar}t}\cdot e^{-i\frac{\hat{H}_0}{\hbar}t} = \hat{I}$, the identity operator. The boxed form of the time-evolution in Equation 20.6 is called the interaction picture, as opposed to the conventional form of Equation 20.3, which is called the Schrödinger picture. Note that if there is no perturbation, $\hat{W}_t = 0 \implies \hat{W}(t) = 0 \implies i\hbar\frac{\partial|\Psi(t)\rangle}{\partial t} = 0$. Then $|\Psi(t)\rangle = |\Psi(t_0)\rangle$, and we have managed to find the state vector representation in which the unperturbed eigenvectors are indeed frozen in time.

² Fig. 20.2 highlights the difference between the state vectors $|\Psi_t\rangle$ and $|\Psi(t)\rangle$.

Now let's turn the perturbation $\hat{W}_t$ on. Formally, the state vector at time $t$ in the interaction representation is obtained by integrating both sides of Equation 20.6:

$$ |\Psi(t)\rangle = |\Psi(t_0)\rangle + \frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\hat{W}(t')\,|\Psi(t')\rangle, \qquad (20.7) $$

and it looks as if we have solved the problem. However, there is a catch – the unknown state vector appears also on the right side – inside the integral. This is a recursive relation! It reminds us of the Brillouin–Wigner form of non-degenerate perturbation theory. Let's try to iterate the formal solution once:

$$ |\Psi(t)\rangle = |\Psi(t_0)\rangle + \frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\hat{W}(t')\left[|\Psi(t_0)\rangle + \frac{1}{i\hbar}\int_{t_0}^{t'}dt''\,\hat{W}(t'')\,|\Psi(t'')\rangle\right], \qquad (20.8) $$

and then keep going:

$$ |\Psi(t)\rangle = \underbrace{|\Psi(t_0)\rangle}_{\sim W^0} + \underbrace{\frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\hat{W}(t')\,|\Psi(t_0)\rangle}_{\sim W^1} + \underbrace{\frac{1}{(i\hbar)^2}\int_{t_0}^{t}dt'\,\hat{W}(t')\int_{t_0}^{t'}dt''\,\hat{W}(t'')\,|\Psi(t_0)\rangle}_{\sim W^2} + \cdots \qquad (20.9) $$

We thus obtain a formal perturbation series to many orders. The hope is that the series converges rapidly if the perturbation is "small", because successive terms increase as a power law, which for a small


number gets even smaller. Let's accept that weak argument now at face value; we return later to address, justify, and where possible, fix this cavalier approximation.

Let $|\Psi(t_0)\rangle = |0\rangle$ be the initial state of the quantum system. The perturbation is turned on at time $t_0$. The probability amplitude for the system to be found in state $|n\rangle$ at time $t(> t_0)$ is $\langle n|\Psi_t\rangle$. Note the Schrödinger representation! But the transformation from the Schrödinger to the interaction picture helps: $\langle n|\Psi_t\rangle = \langle n|e^{-i\frac{\hat{H}_0}{\hbar}t}|\Psi(t)\rangle = e^{-i\frac{E_n}{\hbar}t}\langle n|\Psi(t)\rangle$. This implies $|\langle n|\Psi_t\rangle|^2 = |\langle n|\Psi(t)\rangle|^2$ – for all eigenstates $|n\rangle$. Let us make an approximation in this section and retain only the first order term in the perturbation series. We will return later and discuss the higher order terms that capture multiple-scattering events. Retaining only the terms of Eq. 20.9 to first order in the perturbation $\hat{W}$ gives

$$ \langle n|\Psi(t)\rangle \approx \underbrace{\langle n|0\rangle}_{=0} + \frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\langle n|\hat{W}(t')|0\rangle = \frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\langle n|e^{+i\frac{\hat{H}_0}{\hbar}t'}\,\hat{W}_{t'}\,e^{-i\frac{\hat{H}_0}{\hbar}t'}|0\rangle. \qquad (20.10) $$

Let us assume the perturbation to be of the form $\hat{W}_t = e^{\eta t}W$, representing a "slow turn-on", with $\eta = 0^+$ and $W = W(\mathbf{r})$ a function that depends only on space (Fig. 20.3). If $\eta = 0$, then the perturbation is time-independent. But if $\eta = 0^+$, then $e^{\eta t_0} \to 0$ as $t_0 \to -\infty$. This construction thus effectively kills the perturbation far in the distant past, but slowly turns it on to full strength at $t = 0$. We will discuss more of the physics buried inside $\eta$ later. For now, we accept it as a mathematical construction, with the understanding that we will take the limit $\eta \to 0$ at the end. Then, the amplitude in state $|n\rangle$ simplifies:

$$ \langle n|\Psi(t)\rangle \approx \frac{1}{i\hbar}\int_{t_0}^{t}dt'\,\underbrace{\langle n|e^{+i\frac{\hat{H}_0}{\hbar}t'}}_{e^{+i\frac{E_n}{\hbar}t'}\langle n|}\,e^{\eta t'}W\,\underbrace{e^{-i\frac{\hat{H}_0}{\hbar}t'}|0\rangle}_{e^{-i\frac{E_0}{\hbar}t'}|0\rangle} = \frac{\langle n|W|0\rangle}{i\hbar}\int_{t_0}^{t}dt'\,e^{i\frac{E_n-E_0}{\hbar}t'}e^{\eta t'}, \qquad (20.11) $$

Fig. 20.3 The perturbation potential is turned on slowly to full strength.
and the integral over time may be evaluated exactly to yield

$$ \int_{t_0}^{t}dt'\,e^{i\frac{E_n-E_0}{\hbar}t'}e^{\eta t'} = \frac{e^{i\frac{E_n-E_0}{\hbar}t}e^{\eta t} - e^{i\frac{E_n-E_0}{\hbar}t_0}e^{\eta t_0}}{i\frac{E_n-E_0}{\hbar}+\eta} \;\underset{t_0\to-\infty}{=}\; \frac{e^{i\frac{E_n-E_0}{\hbar}t}e^{\eta t}}{i\frac{E_n-E_0}{\hbar}+\eta}. \qquad (20.12) $$

The amplitude then is

$$ \langle n|\Psi(t)\rangle \approx \frac{\langle n|W|0\rangle}{i\hbar}\cdot\frac{e^{i\frac{E_n-E_0}{\hbar}t}e^{\eta t}}{i\frac{E_n-E_0}{\hbar}+\eta} = \langle n|W|0\rangle\cdot\frac{e^{i\frac{E_n-E_0}{\hbar}t}e^{\eta t}}{(E_0-E_n)+i\hbar\eta}. \qquad (20.13) $$

The probability of the state making a transition from $|0\rangle$ to $|n\rangle$ at time $t$ is

$$ |\langle n|\Psi_t\rangle|^2 = |\langle n|\Psi(t)\rangle|^2 \approx |\langle n|W|0\rangle|^2\,\frac{e^{2\eta t}}{(E_0-E_n)^2+(\hbar\eta)^2}. \qquad (20.14) $$


The rate of transitions from state $|0\rangle \to |n\rangle$ is

$$ \frac{1}{\tau_{|0\rangle\to|n\rangle}} = \frac{d}{dt}|\langle n|\Psi(t)\rangle|^2 \approx |\langle n|W|0\rangle|^2\left(\frac{2\eta}{(E_0-E_n)^2+(\hbar\eta)^2}\right)e^{2\eta t}. \qquad (20.15) $$

Now we take $\eta \to 0^+$. The third term $e^{2\eta t} \to 1$, but we must be careful with the quantity in the round brackets. When $\eta \to 0$, this quantity is 0, except when the term $E_0 - E_n = 0$. For $E_0 = E_n$, the term appears to be in the $0/0$ indeterminate form. By making a plot of this function, we can convince ourselves that it approaches a Dirac delta function $\delta(\cdots)$ in the variable $E_0 - E_n$. The mathematical identity $\lim_{\eta\to0^+}\frac{2\eta}{x^2+\eta^2} = \lim_{\eta\to0^+}\frac{1}{i}\left[\frac{1}{x-i\eta}-\frac{1}{x+i\eta}\right] = 2\pi\delta(x)$ confirms this: in the limit, the term indeed becomes the Dirac delta function. Then, using $\delta(ax) = \delta(x)/|a|$, the rate of transitions is given by

$$ \frac{1}{\tau_{|0\rangle\to|n\rangle}} = \frac{2\pi}{\hbar}\,|\langle n|W|0\rangle|^2\,\delta(E_0-E_n), \qquad (20.16) $$

which is Fermi's golden rule. The general form of the transition rate is $2\pi/\hbar$ times the transition matrix element squared, times a Dirac delta function as a statement of energy conservation³.
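The first-order truncation behind this result can itself be sanity-checked against the exactly solvable two-level system (the case mentioned in the sidenote and treated in Section 20.5). For a constant coupling $w$ between levels split by $\Delta$ switched on at $t=0$, the exact Rabi probability $P_{0\to1}(t) = (w/\Omega)^2\sin^2(\Omega t)$ with $\Omega = \sqrt{(\Delta/2)^2+w^2}$ should approach the first-order result $(2w/\Delta)^2\sin^2(\Delta t/2)$ when $w \ll \Delta$. A minimal numeric check ($\hbar = 1$; the parameter values are arbitrary):

```python
import numpy as np

w, delta = 0.01, 1.0      # coupling w much smaller than level splitting delta (hbar = 1)
t = np.linspace(0.0, 20.0, 2001)

# Exact two-level (Rabi) transition probability for H = [[0, w], [w, delta]],
# with the system starting in the lower state at t = 0:
omega = np.hypot(delta / 2.0, w)                  # generalized Rabi frequency
p_exact = (w / omega) ** 2 * np.sin(omega * t) ** 2

# First-order perturbation theory for the same problem:
p_first = (2.0 * w / delta) ** 2 * np.sin(delta * t / 2.0) ** 2

err = np.max(np.abs(p_exact - p_first))
print(f"max |exact - 1st order| = {err:.1e}, transition prob. scale ~ {p_exact.max():.1e}")
```

The discrepancy is tiny compared to the transition probability itself, illustrating why truncating the perturbation series at first order is safe for weak perturbations.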

20.2 Oscillating perturbations

Now suppose the perturbation potential was oscillating in time. Such cases are encountered for an electron interacting with photons, or phonons, both of which by virtue of being waves, present time-varying scattering potentials for the electron. The mathematical form of such oscillating perturbations with a slow turn-on is

$$ \hat{W}_t = 2We^{\eta t}\cos(\omega t) = e^{\eta t}W(e^{i\omega t}+e^{-i\omega t}), \qquad (20.17) $$

which leads to a $|0\rangle \to |n\rangle$ transition amplitude

$$ \langle n|\Psi(t)\rangle \approx \frac{\langle n|W|0\rangle}{i\hbar}\left[\int_{t_0}^{t}dt'\,e^{i\frac{E_n-E_0+\hbar\omega}{\hbar}t'}e^{\eta t'} + \int_{t_0}^{t}dt'\,e^{i\frac{E_n-E_0-\hbar\omega}{\hbar}t'}e^{\eta t'}\right]. \qquad (20.18) $$

Similar to Equations 20.12 and 20.13, evaluating the integrals with $t_0 \to -\infty$, we get the amplitude for transitions

$$ \langle n|\Psi(t)\rangle \approx \langle n|W|0\rangle\cdot\left[\frac{e^{i\frac{E_n-E_0+\hbar\omega}{\hbar}t}e^{\eta t}}{(E_0-E_n+\hbar\omega)+i\hbar\eta} + \frac{e^{i\frac{E_n-E_0-\hbar\omega}{\hbar}t}e^{\eta t}}{(E_0-E_n-\hbar\omega)+i\hbar\eta}\right]. \qquad (20.19) $$

³ See the exercises of this chapter for a collection of solved examples, and unsolved ones for you to practice. Applications of time-dependent perturbation theory abound in semiconductor physics and are found generously sprinkled throughout the following chapters.


The probability of transition is then

$$ |\langle n|\Psi(t)\rangle|^2 \approx |\langle n|W|0\rangle|^2\cdot\Bigg[\frac{e^{2\eta t}}{(E_0-E_n+\hbar\omega)^2+(\hbar\eta)^2} + \frac{e^{2\eta t}}{(E_0-E_n-\hbar\omega)^2+(\hbar\eta)^2} + \frac{e^{2i\omega t}e^{2\eta t}}{(E_0-E_n+\hbar\omega+i\hbar\eta)(E_0-E_n-\hbar\omega-i\hbar\eta)} + \frac{e^{-2i\omega t}e^{2\eta t}}{(E_0-E_n+\hbar\omega-i\hbar\eta)(E_0-E_n-\hbar\omega+i\hbar\eta)}\Bigg], \qquad (20.20) $$

and the rate of transition is then

$$ \frac{d}{dt}|\langle n|\Psi(t)\rangle|^2 \approx |\langle n|W|0\rangle|^2\times\Bigg[\frac{2\eta e^{2\eta t}}{(E_0-E_n+\hbar\omega)^2+(\hbar\eta)^2} + \frac{2\eta e^{2\eta t}}{(E_0-E_n-\hbar\omega)^2+(\hbar\eta)^2} + \frac{2(\eta+i\omega)e^{2i\omega t}e^{2\eta t}}{(E_0-E_n+\hbar\omega+i\hbar\eta)(E_0-E_n-\hbar\omega-i\hbar\eta)} + \frac{2(\eta-i\omega)e^{-2i\omega t}e^{2\eta t}}{(E_0-E_n+\hbar\omega-i\hbar\eta)(E_0-E_n-\hbar\omega+i\hbar\eta)}\Bigg]. \qquad (20.21) $$

Notice that the last two (interference) terms are a complex conjugate pair, which they must be, because the rate of transition is real. Their sum is then $2\times$ the real part of either term. After some manipulations, we get

$$ \frac{d}{dt}|\langle n|\Psi(t)\rangle|^2 \approx |\langle n|W|0\rangle|^2\,e^{2\eta t}\cdot\Bigg[\left(\frac{2\eta}{(E_0-E_n+\hbar\omega)^2+(\hbar\eta)^2} + \frac{2\eta}{(E_0-E_n-\hbar\omega)^2+(\hbar\eta)^2}\right)[1-\cos(2\omega t)] + 2\sin(2\omega t)\left(\frac{E_0-E_n+\hbar\omega}{(E_0-E_n+\hbar\omega)^2+(\hbar\eta)^2} - \frac{E_0-E_n-\hbar\omega}{(E_0-E_n-\hbar\omega)^2+(\hbar\eta)^2}\right)\Bigg]. \qquad (20.22) $$

Note that the rate has a part that does not oscillate, and another which does, with twice the frequency of the perturbing potential. If we average over a few periods of the oscillation, $\langle\cos(2\omega t)\rangle_t = \langle\sin(2\omega t)\rangle_t = 0$. Then, by taking the limit $\eta \to 0^+$ in the same fashion as in Equation 20.16, we obtain Fermi's golden rule for oscillating perturbations:

$$ \frac{1}{\tau_{|0\rangle\to|n\rangle}} \approx \frac{2\pi}{\hbar}\times|\langle n|W|0\rangle|^2\times\Big[\underbrace{\delta(E_0-E_n+\hbar\omega)}_{\text{absorption}} + \underbrace{\delta(E_0-E_n-\hbar\omega)}_{\text{emission}}\Big]. \qquad (20.23) $$

The Dirac delta functions now indicate that the exchange of energy between the quantum system and the perturbing field is through quanta of energy: either by absorption, leading to $E_n = E_0 + \hbar\omega$, or emission, leading to $E_n = E_0 - \hbar\omega$. The rates of absorption and emission are the same. Which process (emission or absorption) dominates depends on the occupation functions of the quantum states⁴.

⁴ Typical transition rates for electrons due to interaction with light (photons) are $1/\tau \sim 10^9$/s, or it takes $\sim$ ns for an optical transition. The same for phonons is $1/\tau \sim 10^{12}$/s; it takes $\sim$ ps for thermal transitions. These processes are discussed in following chapters.
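The $\eta \to 0^+$ limit taken in Equations 20.16 and 20.23 can be checked numerically: the Lorentzian $2\eta/(x^2+\eta^2)$ narrows as $\eta$ shrinks while its area stays pinned at $2\pi$, which is exactly the behavior of $2\pi\delta(x)$. A small sketch (the grid parameters are arbitrary):

```python
import numpy as np

def lorentz_area_peak(eta, xmax=1000.0, n=4_000_001):
    """Riemann-sum area and peak height of f(x) = 2*eta/(x**2 + eta**2)."""
    x, dx = np.linspace(-xmax, xmax, n, retstep=True)
    f = 2.0 * eta / (x**2 + eta**2)
    return f.sum() * dx, f.max()

for eta in (1.0, 0.1, 0.01):
    area, peak = lorentz_area_peak(eta)
    print(f"eta = {eta:5.2f}: area = {area:.4f}  (2*pi = {2*np.pi:.4f}), peak = {peak:.0f}")
```

The peak grows as $2/\eta$ while the area stays at $2\pi$ (up to the finite integration window), confirming the delta-function identity used in the limit.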


20.3 Transitions to continuum

The Fermi golden rule results of Equations 20.16 and 20.23 are in forms that are suitable for tracking transitions between discrete, or individual, states $|0\rangle$ and $|n\rangle$. For many situations encountered in semiconductors, these transitions will be between states within, or between, energy bands, where a continuum of states exists. In those cases, the net transition rate will be obtained by summing over all relevant states. Even the transition between manifestly discrete states – for example from the electron ground state of the hydrogen atom to the first excited state by the absorption of a photon – occurs by the interaction between the discrete electron states and the states of the electromagnetic spectrum, which form a continuum.

As an example, consider the transitions between electron states in the conduction band due to a point scatterer in a 3D semiconductor (see Fig. 20.4). Let us say the point scatterer potential is $W(\mathbf{r}) = V_0\delta(\mathbf{r})$, with $V_0$ in units of eV·m³. This is not an oscillating potential, so we use the golden rule result of Equation 20.16. We first find the matrix element between plane-wave electron states $|\mathbf{k}\rangle$ and $|\mathbf{k}'\rangle$:

$$ \langle\mathbf{k}'|V_0\delta(\mathbf{r})|\mathbf{k}\rangle = \int d^3r\,\left(\frac{e^{-i\mathbf{k}'\cdot\mathbf{r}}}{\sqrt{V}}\right)V_0\delta(\mathbf{r})\left(\frac{e^{+i\mathbf{k}\cdot\mathbf{r}}}{\sqrt{V}}\right) = \frac{V_0}{V}. \qquad (20.24) $$

Here we have used the property that the Fourier transform of a Dirac delta function is equal to 1. Notice that the matrix element for scattering of plane waves is the Fourier transform of the scattering potential, a property referred to as the Born approximation. The physical meaning is that the perturbation potential $W(\mathbf{r})$ can scatter an electron from a state $\mathbf{k}$ to a state $\mathbf{k}'$ only if its Fourier transform $W_q = \frac{1}{V}\int d^3r\,e^{i\mathbf{q}\cdot\mathbf{r}}W(\mathbf{r})$ has a non-zero coefficient at $\mathbf{q} = \mathbf{k}-\mathbf{k}'$. Then, the transition (or scattering) rate to any final state $|\mathbf{k}'\rangle$ is

$$ \frac{1}{\tau(|\mathbf{k}\rangle\to|\mathbf{k}'\rangle)} = \frac{2\pi}{\hbar}\left(\frac{V_0}{V}\right)^2\delta(E_k-E_{k'}). \qquad (20.25) $$

Fig. 20.4 Illustration of scattering of a conduction electron by a perturbation potential W (r ) in real space, and the same process illustrated in the k-space for the electron bandstructure. For an elastic scattering event, there are a continuum of available final states.


The net scattering "out" of state $|\mathbf{k}\rangle$ into the continuum of states $|\mathbf{k}'\rangle$ is then given by

$$ \frac{1}{\tau(|\mathbf{k}\rangle)} = \sum_{\mathbf{k}'}\frac{1}{\tau(|\mathbf{k}\rangle\to|\mathbf{k}'\rangle)} = \frac{2\pi}{\hbar}\left(\frac{V_0}{V}\right)^2\underbrace{\sum_{\mathbf{k}'}\delta(E_k-E_{k'})}_{G(E_k)}, \qquad (20.26) $$



where we note that the sum over final states of the Dirac delta function is the density of states⁵ $G(E_k)$, in units of eV⁻¹, of the electron at energy $E_k$. This procedure illustrates an important result – the scattering rate from a single initial state into a continuum of states is proportional to the density of available final states. The strength of scattering increases as the square of the scattering potential.

⁵ Note here that $G(E_k)$ must not include the spin degeneracy for scattering events that cannot flip the spin of an electron.

The occurrence of the (volume)² term in the denominator may be disconcerting at first. However, the macroscopic volume (or area, or length) terms will in most cases cancel out for purely physical reasons. For example, for the problem illustrated here, if instead of just one point scatterer we had $N$, the density of scatterers is $n_{sc} = N/V$. Together with the conversion $\sum_{\mathbf{k}'} \to V\int d^3k'/(2\pi)^3$, we obtain

$$ \frac{1}{\tau(E_k)} = \frac{2\pi}{\hbar}\left(\frac{V_0}{V}\right)^2 n_{sc}V\cdot V\int\frac{d^3k'}{(2\pi)^3}\,\delta(E_k-E_{k'}) = \frac{2\pi}{\hbar}V_0^2\,n_{sc}\,g(E_k). \qquad (20.27) $$

Here the density of states $g(E_k)$ is per unit volume, in units of 1/(eV·m³), as is standard in semiconductor physics. The scattering rate is linearly proportional to the density of scatterers. What is not immediately clear is how we can capture the effect of $N$ scatterers by just multiplying the individual scatterer rate by $N$. This can be done if the scatterers are uncorrelated, as will be discussed in Chapter 23. For now, note that the macroscopic volume has canceled out, as promised⁶.

⁶ For typical scattering center densities of $n_{sc} \sim 10^{17}$/cm³, the scattering rate is of the order of $1/\tau \sim 10^{12}$/s, or a scattering event occurs every ps. Of course this depends on the scattering potential and other parameters, and there are also several types of scattering rates, as is investigated in the following chapters on Boltzmann transport (Chapter 21), phonons (Chapter 22), and mobility (Chapter 23).
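Equation 20.27 is easy to evaluate for representative numbers. The sketch below assumes a parabolic band with the single-spin 3D density of states $g(E) = \frac{1}{4\pi^2}\left(\frac{2m^\star}{\hbar^2}\right)^{3/2}\sqrt{E}$, and the values of $V_0$, $m^\star$, and $n_{sc}$ are illustrative assumptions (not numbers from the text):

```python
import numpy as np

hbar = 1.054571817e-34        # [J.s]
q = 1.602176634e-19           # [C], also J per eV
m0 = 9.1093837015e-31         # electron rest mass [kg]

m_eff = 0.2 * m0              # effective mass (assumed)
energy = 0.025 * q            # electron energy ~ kT at 300 K [J]
v0 = 1.0 * q * 1e-27          # scatterer strength: 1 eV.nm^3 in J.m^3 (assumed)
n_sc = 1e23                   # scatterer density: 1e17 /cm^3 in 1/m^3

# single-spin parabolic-band density of states per unit volume [1/(J.m^3)]
g = (2.0 * m_eff / hbar**2) ** 1.5 * np.sqrt(energy) / (4.0 * np.pi**2)

rate = (2.0 * np.pi / hbar) * v0**2 * n_sc * g       # Eq. 20.27, [1/s]
print(f"1/tau ~ {rate:.1e} /s, i.e. one scattering event every ~{1e12/rate:.0f} ps")
```

With these assumed inputs the rate lands in the $10^{10}$–$10^{11}$/s range, within an order of magnitude or two of the typical ps-scale rates quoted in the sidenote; the exact value depends strongly on the assumed $V_0$.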

20.4 Kubo–Greenwood formula

Fermi's golden rule helps find the electronic response of any solid to an oscillating electric field F cos(ωt), as shown in Fig. 20.5. Electrons in the solid respond to the electric field by carrying a current that is linearly proportional to the field, with the electronic conductivity σ(ω) as the proportionality constant. The rate of energy absorbed and dissipated in the solid is ½σ(ω)F². This must be due to the difference of downward and upward electronic transition processes, as indicated in the figure. Using this idea, the Kubo–Greenwood formula expresses the macroscopic electronic conductivity σ(ω) of the solid in terms of the microscopic electronic density of states g(E). The method hinges on energy conservation in the transitions. Consider the density of states g(E) of the allowed electron eigenvalues in the solid, with an occupation function f(E). Writing the electric field as ½F(e^{iωt} + e^{−iωt}), the perturbation potential experienced by the electrons is W(x,t) = ½qFx(e^{iωt} + e^{−iωt}). Using the golden rule (Equation 20.23), the rate of an upward transition from state |i⟩ → |f⟩ is

\[
\frac{1}{\tau_{up}(i \to f)} = \frac{2\pi}{\hbar} \left| \langle f | \tfrac{1}{2} qFx | i \rangle \right|^2 \delta(E_f - (E_i + \hbar\omega)) = \frac{\pi q^2 F^2}{2\hbar} |\langle f | x | i \rangle|^2 \, \delta(E_f - (E_i + \hbar\omega)). \tag{20.28}
\]

The matrix element ⟨f|x|i⟩ is dimensionally a length, and is called a dipole matrix element. It is sometimes preferable to trade off x for the derivative ∂/∂x to get the closely related momentum matrix element ⟨f|−iħ∂/∂x|i⟩. This is achieved using the property [x, Ĥ₀] = (ħ²/m_e)∂/∂x, which was derived in Equation 9.38 for an unperturbed Hamiltonian Ĥ₀. Because the initial and final states are eigenstates of the unperturbed Hamiltonian


Fig. 20.5 The linear response of a solid subjected to an oscillating electric field F cos(ωt) is its electronic conductivity σ(ω ). The Kubo–Greenwood formula relates σ (ω ) to the electronic density of states g( E) of the solid by the application of Fermi’s golden rule.


Ĥ₀|i⟩ = E_i|i⟩ and Ĥ₀|f⟩ = E_f|f⟩, this leads to

\[
\langle f | \frac{\partial}{\partial x} | i \rangle = \frac{m_e}{\hbar^2} \langle f | [x, \hat{H}_0] | i \rangle = \frac{m_e}{\hbar^2} \left( \langle f | x \hat{H}_0 | i \rangle - \langle f | \hat{H}_0 x | i \rangle \right) = \frac{m_e (E_i - E_f)}{\hbar^2} \langle f | x | i \rangle,
\]
\[
E_f = E_i + \hbar\omega \implies \langle f | x | i \rangle = -\frac{\hbar}{m_e \omega} \langle f | \frac{\partial}{\partial x} | i \rangle. \tag{20.29}
\]

Writing the matrix element as D = ⟨f|∂/∂x|i⟩ = ∫d³x · ψ*_{E+ħω} (∂/∂x) ψ_E,

\[
\frac{1}{\tau_{up}(i \to f)} = \frac{\pi q^2 \hbar}{2 m_e^2 \omega^2} F^2 |D|^2 \, \delta(E_f - (E_i + \hbar\omega)),
\]
\[
\frac{1}{\tau_{up}} = \sum_f \frac{1}{\tau_{up}(i \to f)} = \frac{\pi q^2 \hbar}{2 m_e^2 \omega^2} F^2 \sum_f |D|^2 \, \delta(E_f - (E_i + \hbar\omega)),
\]
\[
\frac{1}{\tau_{up}} = \frac{\pi q^2 \hbar V}{2 m_e^2 \omega^2} \cdot F^2 |D|^2 \cdot g(E + \hbar\omega) \quad \text{and} \quad \frac{1}{\tau_{down}} = \frac{\pi q^2 \hbar V}{2 m_e^2 \omega^2} \cdot F^2 |D|^2 \cdot g(E). \tag{20.30}
\]

Note that the net rate of upward transitions from a single state E_i = E to all possible final states E_f = E + ħω is proportional to the number of final states g(E + ħω)V in a unit energy interval. For the same reason, the rate of downward transitions is proportional to g(E)V. For energy conservation, we must also sum over all possible initial states. Because of the Pauli exclusion principle, the transitions actually occur only if the initial states are filled and the final states are empty. Energy conservation per unit time then gives us the relation

\[
\frac{1}{2} \sigma(\omega) F^2 = P_{up} - P_{down}, \quad \text{where}
\]
\[
P_{up} = 2 \cdot \hbar\omega \cdot \int dE \cdot \frac{1}{\tau_{up}} \cdot f(E)\left[1 - f(E + \hbar\omega)\right] \cdot g(E), \quad \text{and}
\]
\[
P_{down} = 2 \cdot \hbar\omega \cdot \int dE \cdot \frac{1}{\tau_{down}} \cdot f(E + \hbar\omega)\left[1 - f(E)\right] \cdot g(E + \hbar\omega),
\]

which give

\[
P_{up} - P_{down} = 2 \cdot \hbar\omega \cdot \frac{\pi q^2 \hbar V}{2 m_e^2 \omega^2} \cdot F^2 \cdot \int dE \, |D|^2 \left[ f(E)(1 - f(E + \hbar\omega)) - f(E + \hbar\omega)(1 - f(E)) \right] g(E)\, g(E + \hbar\omega),
\]
\[
\frac{1}{2} \sigma(\omega) F^2 = \frac{\pi q^2 \hbar^3 V}{m_e^2} \cdot F^2 \cdot \int dE \, |D|^2 g(E)\, g(E + \hbar\omega) \left[ \frac{f(E) - f(E + \hbar\omega)}{\hbar\omega} \right]
\]
\[
\implies \boxed{ \; \sigma(\omega) = \frac{2\pi q^2 \hbar^3 V}{m_e^2} \int dE \, |D|^2 g(E)\, g(E + \hbar\omega) \left[ \frac{f(E) - f(E + \hbar\omega)}{\hbar\omega} \right] \; }. \tag{20.31}
\]

The final boxed form is the Kubo–Greenwood formula. It provides a method to compute the electronic conductivity σ(ω) from the electronic density of states g(E) in the solid and the occupation function f(E). As an example, for near-DC excitation ω → 0, the term in the square brackets becomes −∂f/∂E, the derivative of the occupation function. If the occupation function follows the Fermi–Dirac distribution, at low temperatures −∂f/∂E ≈ δ(E − E_F), and the electronic conductivity becomes σ(0) ≈ (2πq²ħ³V/m_e²)|D|²_av g(E_F)². N. Mott and D. J. Thouless have shown that for normal band-transport of extended electronic states the Kubo–Greenwood formula yields the well-known electrical conductivity σ = nq²τ/m*, as discussed in Chapter 21 on Boltzmann transport. Exercise 20.9 takes you through this derivation⁷. But the utility of this formula extends to cases where Boltzmann transport fails, such as in disordered or amorphous solids and in the presence of localization, and it also provides a unified response to both DC and high-frequency fields. For example, σ(ω) can also model interband optical absorption. Surprisingly, even conductivity resulting from very high electric fields, such as interband tunneling in semiconductors, can be computed using this formula in far more generality than several other approaches⁸. The Kubo–Greenwood formula is an example of the fluctuation–dissipation theorems of statistical mechanics, which quantitatively relate microscopic fluctuations or "noise" to energy dissipation. Another example is the Einstein relation D/μ = k_bT/q, which relates the fluctuations in diffusive motion via the diffusion constant D to the dissipation due to drift via the mobility μ.

⁷ It is important to note that the Kubo–Greenwood formula provides a path to understanding the electrical conductivity of any solid for which the DOS g(E) is known. An example is amorphous solids, for which the whole framework of periodicity and the Bloch theorem fails. For instance, this formalism explains why amorphous SiO₂ is insulating.

⁸ Chapter 24, Section 24.15 discusses the Kubo formalism for tunneling transport.
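The ω → 0 limit quoted above is easy to check numerically: for a Fermi–Dirac f(E), the finite-difference factor [f(E) − f(E + ħω)]/ħω appearing in Eq. (20.31) collapses onto −∂f/∂E as ħω → 0, which at low temperature sharpens toward δ(E − E_F). A minimal sketch, with energies in eV and an arbitrarily assumed Fermi level and temperature:

```python
import numpy as np

kT = 0.0259          # thermal energy at 300 K, eV
EF = 1.0             # assumed Fermi level, eV

def f(E):
    """Fermi-Dirac occupation function."""
    return 1.0 / (1.0 + np.exp((E - EF) / kT))

E = np.linspace(0.5, 1.5, 2001)

# -df/dE for the Fermi-Dirac distribution, analytically: f(1-f)/kT
minus_dfdE = f(E) * (1.0 - f(E)) / kT

errs = []
for hw in (0.1, 0.01, 0.001):             # hbar*omega, eV
    kernel = (f(E) - f(E + hw)) / hw      # the square bracket in Eq. (20.31)
    errs.append(np.max(np.abs(kernel - minus_dfdE)))
    print(f"hbar*w = {hw:5.3f} eV -> max deviation from -df/dE: {errs[-1]:.3e}")
```

As ħω shrinks, the kernel converges to −∂f/∂E, whose integral over the window is f(0.5) − f(1.5) ≈ 1; at low temperature it acts as the δ(E − E_F) used in the DC limit above.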

20.5 Decoherence in qubits

Fig. 20.6 Isidor Rabi pioneered magnetic resonance and molecular beam techniques. He developed the exact solutions of the 2-level quantum system, and the resulting oscillations are named after him. Rabi was awarded the 1944 Nobel Prize in Physics.

In Chapter 5 we discussed a few exact solutions of the Schrödinger equation in time-independent potentials, such as the particle in a box, the harmonic oscillator, and the hydrogen atom. Though exact solutions are rare in time-dependent perturbation theory, there is one problem of tremendous practical importance where such a solution does exist: a 2-level quantum system in an oscillating electric field (see Fig. 20.6). This problem has been a source of a stream of new ideas in the past century. The latest practical manifestation of this system is as a quantum bit, or a qubit. The time evolution of a quantum state formed by the superposition |Ψ(t)⟩ = ∑_n c_n(t)|n⟩ of a large number of states in the interaction picture is given by Equation 20.6 as

\[
i\hbar \frac{d}{dt}
\begin{pmatrix} c_1(t) \\ c_2(t) \\ c_3(t) \\ \vdots \end{pmatrix}
=
\begin{pmatrix}
W_{11} & W_{12} e^{i\omega_{12}t} & W_{13} e^{i\omega_{13}t} & \cdots \\
W_{21} e^{i\omega_{21}t} & W_{22} & W_{23} e^{i\omega_{23}t} & \cdots \\
W_{31} e^{i\omega_{31}t} & W_{32} e^{i\omega_{32}t} & W_{33} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} c_1(t) \\ c_2(t) \\ c_3(t) \\ \vdots \end{pmatrix}. \tag{20.32}
\]

The matrix defines coupled differential equations that correlate all coefficients while preserving ∑_n |c_n(t)|² = 1 at all times.

Fig. 20.7 (a) Rabi oscillations in a 2-level quantum system. It models a qubit, which could be formed from 2 discrete electron eigenvalues in a semiconductor quantum dot and a photon trapped between mirrors. (b) If the 2 levels are not decoupled from the crystal, or if the photon escapes the cavity, the qubit state decays and loses its quantum information processing capacity.

For a time-dependent perturbation Ŵ(t), the W_nm in the matrix are also time-dependent. For a 2-level quantum system, the matrix reduces to

\[
i\hbar \frac{d}{dt} \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix} = \begin{pmatrix} W_{11} & W_{12} e^{i\omega_{12}t} \\ W_{21} e^{i\omega_{21}t} & W_{22} \end{pmatrix} \begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix}. \tag{20.33}
\]

For the two unperturbed levels, Ĥ₀|1⟩ = E₁|1⟩ and Ĥ₀|2⟩ = E₂|2⟩. The perturbing potential from the oscillating electric field is W(r,t) = qE₀x(e^{iωt} + e^{−iωt}), and |W₁₂| = |⟨2|qE₀x|1⟩| = γ is the magnitude of the matrix element. Let the 2-level system have one electron. If at time t = 0 state |1⟩ is occupied and state |2⟩ is empty, c₁(0) = 1 and c₂(0) = 0. In Exercise 20.10, you will show that the time evolution of the states is then characterized by Rabi oscillations, with probabilities

\[
|c_2(t)|^2 = C^2 \sin^2 \Omega t \quad \text{and} \quad |c_1(t)|^2 = 1 - C^2 \sin^2 \Omega t = 1 - |c_2(t)|^2,
\]
\[
\text{where} \quad C^2 = \frac{(\frac{\gamma}{\hbar})^2}{(\frac{\gamma}{\hbar})^2 + (\frac{\omega - \omega_{21}}{2})^2} \quad \text{and} \quad \Omega = \sqrt{\left(\frac{\gamma}{\hbar}\right)^2 + \left(\frac{\omega - \omega_{21}}{2}\right)^2}. \tag{20.34}
\]

Fig. 20.7 shows the Rabi oscillations of the qubit. The electron absorbs the energy of the oscillating electric field (the photon) and transitions to the upper state in the first half of a cycle. If the photon energy ħω is "tuned" to the energy level difference ħω₂₁ = E₂ − E₁, the transition can be complete because C² = 1, and |c₂(t)|² can reach unity, as seen in Fig. 20.7(a). In the next half of the cycle, the electron makes a downward transition and emits the photon back. This cycle of absorption and emission repeats. The superposition state vector for the qubit, |ψ⟩ = c₁(t)|1⟩ + c₂(t)|2⟩, is pictured as a vector on a Bloch sphere. In Rabi oscillations, the tip of this state vector traverses a spiral trajectory going from the north to the south pole and back in a continuous fashion.
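Equations 20.34 can be checked against a direct numerical integration. The sketch below integrates the rotating-wave form of Equation 20.33 (interaction picture with W₁₁ = W₂₂ = 0, keeping only the near-resonant exponential — an assumption valid for γ, |ω − ω₂₁| ≪ ω₂₁) using a hand-rolled RK4 stepper; units are ħ = 1, and the values of γ and the detuning are arbitrary:

```python
import numpy as np

g = 0.1              # gamma/hbar, assumed coupling (arbitrary units, hbar = 1)
w21 = 1.0            # level spacing (E2 - E1)/hbar
w = 1.05             # drive frequency, slightly detuned
d = w21 - w          # detuning

def rhs(t, y):
    """RWA two-level equations: i c1' = g e^{-i d t} c2, i c2' = g e^{+i d t} c1."""
    c1, c2 = y
    return np.array([-1j * g * np.exp(-1j * d * t) * c2,
                     -1j * g * np.exp(+1j * d * t) * c1])

# Fixed-step 4th-order Runge-Kutta integration from c1(0)=1, c2(0)=0
dt, T = 0.01, 50.0
ts = np.arange(0.0, T + dt, dt)
y = np.array([1.0 + 0j, 0.0 + 0j])
p2_num = [0.0]
for t in ts[:-1]:
    k1 = rhs(t, y)
    k2 = rhs(t + dt / 2, y + dt / 2 * k1)
    k3 = rhs(t + dt / 2, y + dt / 2 * k2)
    k4 = rhs(t + dt, y + dt * k3)
    y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    p2_num.append(abs(y[1])**2)
p2_num = np.array(p2_num)

# Analytic Rabi result, Eq. (20.34)
Om = np.sqrt(g**2 + (d / 2)**2)
C2 = g**2 / (g**2 + (d / 2)**2)
p2_exact = C2 * np.sin(Om * ts)**2
print("max |numeric - analytic| =", np.max(np.abs(p2_num - p2_exact)))
```

At resonance (ω = ω₂₁) the same code gives C² = 1 and a complete population inversion, as in Fig. 20.7(a).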

Fig. 20.8 Rabi resonance quality.

472 Fermi’s Golden Rule

Fig. 20.9 Decoherence of qubits due to the effect of the environment and energetically close eigenvalues. That even topological protection does not make qubits immune to decoherence is explained by Fermi's golden rule.

Fig. 20.10 Practical realizations of qubits using semiconductors. So that thermal vibrations do not cause unintended transitions between the states E₁ and E₂, qubits are operated at temperatures k_bT ≪ E₂ − E₁.

(b) Show now, using Fermi's golden rule, that the net rate of reflection is the same as the E ≫ ∆E_c approximation of the exact relation of part (a). Follow through the steps, since each illustrates the scope and the limitations of the Fermi's golden rule approach.

(20.6) Stuck in a Dirac delta well: light to the rescue!
An electron is stuck in the single bound state of a 1D Dirac delta potential well, at an energy E_δ below zero. A light beam creates an electric potential W(x,t) = F₀x cos(ω₀t) for the electron.
(a) Show that the rate at which an electron is ejected from the well is given by

\[
\frac{1}{\tau} = \frac{2\hbar F_0^2 \, |E_\delta|^{3/2} \sqrt{\hbar\omega_0 - |E_\delta|}}{m_e (\hbar\omega_0)^4}.
\]

Solution: A 1D Dirac delta well famously has a single bound state (see Equation 5.87) at an energy |E_δ| = m_eS²/2ħ² = ħ²κ²/2m_e below zero, with an exponentially decaying eigenfunction ψ_δ(x) = √κ e^{−κ|x|}. Let us approximate the final state by a free plane wave, blissfully unaware of the presence of the well, with eigenfunction ψ_f(x) = (1/√L)e^{ikx} and energy E_f = ħ²k²/2m_e. Here L is an artificial large length inside which the well is located. Then the matrix element is

\[
\langle f | W | i \rangle = \int_{-\infty}^{+\infty} dx \cdot \psi_f^\star(x)\, W(x)\, \psi_\delta(x) = \frac{2ik\kappa^{3/2}}{\sqrt{L}} \cdot \frac{F_0}{(k^2 + \kappa^2)^2}.
\]

Calculate and confirm this, using the fact that in W(x,t) = (F₀x/2)(e^{iω₀t} + e^{−iω₀t}), only the absorption term can remove an electron from the bound state into the continuum; there is no emission. From Fermi's golden rule, the scattering rate then is 1/τ_{i→f} = (2π/ħ)|⟨f|W|i⟩|² δ(E_f − (E_i + ħω₀)), which, when summed over all final k-states using ∑_k(...) = L∫_{−∞}^{+∞}(dk/2π)(...), gives

\[
\frac{1}{\tau} = \frac{2\pi}{\hbar} \cdot 4F_0^2 \kappa^3 \int_{-\infty}^{+\infty} \frac{dk}{2\pi} \, \frac{k^2}{(k^2 + \kappa^2)^4} \, \delta\!\left( \frac{\hbar^2 k^2}{2m_e} - (\hbar\omega_0 - |E_\delta|) \right) = \frac{2\hbar F_0^2 \, |E_\delta|^{3/2} \sqrt{\hbar\omega_0 - |E_\delta|}}{m_e (\hbar\omega_0)^4}.
\]
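The bound-to-free integral quoted in the solution can be verified by direct quadrature. In units ħ = m_e = 1, with assumed values of κ and k, the magnitude of ∫ x e^{−ikx} √κ e^{−κ|x|} dx should equal 4kκ^{3/2}/(k² + κ²)²:

```python
import numpy as np

kappa, k = 1.3, 0.7      # assumed bound-state decay constant and free wavevector
dx = 2e-4
x = np.arange(-40.0, 40.0, dx)    # window wide enough that exp(-kappa*|x|) is dead

psi_bound = np.sqrt(kappa) * np.exp(-kappa * np.abs(x))   # normalized bound state
integrand = x * np.exp(-1j * k * x) * psi_bound
numeric = np.sum(integrand) * dx                          # simple Riemann sum

analytic_mag = 4.0 * k * kappa**1.5 / (k**2 + kappa**2)**2
print(abs(numeric), analytic_mag)    # the magnitudes agree
```

Multiplying this by F₀/(2√L) recovers the matrix element of the absorption term used in the golden-rule rate above (the overall phase depends on the plane-wave convention).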

(b) Show that the maximum rate of electron ejection for a given photon energy is 3√3 ħF₀²/(8m_e(ħω₀)²), whereas for a given bound state energy it is 343√7 ħF₀²/(2048 m_e|E_δ|²).

(c) Electrons trapped in deep level states in the bandgap of a 1D semiconductor quantum wire can be optically excited into the conduction band in this fashion. Consider N_t ∼ 10⁵/cm deep traps in a quantum wire, |E_δ| = 1 eV below the conduction band edge of a wide-bandgap semiconductor quantum wire. Find and plot the generation rate G of electrons, in (cm·s)⁻¹, as a function of the photon energy ħω₀ and the electric field amplitude F₀/q.

(20.7) Scattering rate due to a large number of randomly distributed scattering sites
Instead of one scatterer of potential W₀(r) centered at the origin, how would the net scattering rate change if there were N scatterers distributed randomly at r = R_i for i = 1 to N, with a net scattering potential W(r) = W₀(r − R₁) + W₀(r − R₂) + ... + W₀(r − R_N)? Show that 1/τ is proportional not to N² but to N. This situation is encountered in ionized impurity scattering, is discussed in Chapter 23, and leads to the experimental result that the ionized-impurity limited electron mobility in a semiconductor reduces as μ_imp ∝ 1/N_imp.

(20.8) Optical transition rate of a hydrogen atom
As a model of transition from bound to free states, consider an electron in the ground state of a hydrogen atom as its initial state. The energy is E_1s = −Ry, where Ry is the Rydberg, and the wavefunction is ψ_1s(r) = (1/√(πa_B³)) exp[−r/a_B], where a_B is the Bohr radius. Assume that a light beam creates the perturbation potential W(z,t) = qFz cos ωt.
(a) Find the matrix element for the transition from the ground state to a free-electron final state with energy ħ²k_f²/2m_e and wavefunction ψ_f(r) = (1/√V)e^{ik·r}.
(b) Sketch the shape of the squared matrix element. For what wavelengths of the final state is the matrix element maximized?
(c) Summing over all possible final states using Fermi's golden rule, calculate and estimate numerically the ionization rate of the hydrogen atom for F = 1 kV/m.

(20.9) The Kubo–Greenwood conductivity formula
In this problem, we show that the Boltzmann result for metallic conductivity σ = nq²τ/m_e can be obtained from the Kubo–Greenwood formula, which is derived from Fermi's golden rule. It was shown in the chapter why the low-temperature DC conductivity from the Kubo–Greenwood formula is σ(0) ≈ (2πq²ħ³V/m_e²)|D|²_av g(E_F)² ≈ nq²τ/m_e up to a factor of unity.
(a) Show that the averaged matrix element square is |D|²_av ≈ πλ_mfp/3V, where the mean free path λ_mfp = v_Fτ is the product of the Fermi velocity v_F and a scattering time τ.

480 Exercises

Solution: The matrix element in Fermi's golden rule is D = ∫d³x · ψ*_{E+ħω}(∂/∂x)ψ_E. Consider the initial and final states to be plane waves ψ_E(r) = (1/√V)e^{ik·r} and ψ_{E'}(r) = (1/√V)e^{ik'·r}. The matrix element is then D = (ik/V)∫d³r e^{i(k−k')·r}, where the integral is over the entire volume V. If there is a scattering event every τ seconds for electrons at the Fermi surface moving with velocity v_F, the mean free path is λ_mfp ∼ v_Fτ. As a result of scattering, the wavevector k is randomized over a propagation length λ_mfp, and volume v = (4/3)πλ³_mfp. There are N = V/v such volumes in the crystal. The integral ∫_V d³r e^{i(k−k')·r} = ∫_{v₁}d³r e^{i(k−k')·r} + ... + ∫_{v_N}d³r e^{i(k−k')·r} is thus a sum of N terms with uncorrelated, randomized phases because of scattering. If scattering occurs reasonably often, the volume v is small, e^{i(k−k')·r₁} ≈ e^{iφ₁} is constant within volume v, and ∫_v d³r e^{i(k−k')·r} ≈ v e^{iφ₁}. Then D = ik·(v/V)(e^{iφ₁} + e^{iφ₂} + ... + e^{iφ_N}), and |D|² = k²(v/V)²N = (v/V)k², where because of the random phases and large N the cross terms cancel. This sort of summation is characteristic of a random walk or diffusive process. We must average this |D|² over a volume v and not V. Assuming the scattering to be nearly elastic, (k − k')·r ≈ 2k sin(θ/2)λ_mfp ≈ kθλ_mfp ≤ 1, meaning the small angular deviation is θ ≤ 1/(kλ_mfp). Averaging over the small-angle cone yields

\[
|D|^2_{av} = \frac{v}{V} k^2 \cdot \frac{\int_0^{1/k\lambda_{mfp}} \sin\theta \, d\theta}{\int_0^{\pi} \sin\theta \, d\theta} \sim \frac{v}{V} k^2 \cdot \frac{1}{4 k^2 \lambda_{mfp}^2} = \frac{\pi \lambda_{mfp}}{3V} = \frac{\pi v_F \tau}{3V}.
\]

(b) Using the value of |D|²_av from part (a) and the metallic density of states g(E_F), prove that the DC conductivity is indeed σ(0) = nq²τ/m_e up to a factor of unity.

(c) D. J. Thouless took a different approach (see The Philosophical Magazine, volume 32, issue 2, pages 877–879, 1975) to derive the entire AC response from the Kubo–Greenwood formula as

\[
\sigma(\omega) = \sigma(0) \cdot \frac{1}{1 + \dfrac{m_e^2 \omega^2 \lambda_{mfp}^2}{\hbar^2 k_F^2}} = \frac{\sigma(0)}{1 + \omega^2 \tau^2}. \tag{20.46}
\]

Identify τ in the AC damping of conductivity with the scattering time. Read the paper and note the rather interesting approach taken in it.

(20.10) Rabi oscillations
In this Exercise, you solve the 2-level quantum system problem exactly to recover the results of Rabi oscillations.
(a) Prove that the solution to the matrix Equation 20.33 is given by Equations 20.34.
(b) Make numerical plots of the solutions as shown in Fig. 20.8 for ħω₂₁ = 1.0 eV and for values of γ = 0.01, 0.1, and 1.0 eV.
(c) Do some research on the range of physical problems for which the exact 2-level solution has enabled practical applications, and write a short report on its importance in the past and in the future.
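The random-walk argument in the solution — that a sum of N uncorrelated unit phasors has |∑e^{iφᵢ}|² ≈ N rather than N², which is also the content of Exercise 20.7 — is easy to confirm numerically. A sketch with an assumed N and a seeded random generator:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded for reproducibility
N, trials = 1000, 2000

# |sum of N random unit phasors|^2, averaged over many realizations
phases = rng.uniform(0.0, 2.0 * np.pi, size=(trials, N))
S2 = np.abs(np.exp(1j * phases).sum(axis=1))**2
print(S2.mean() / N)   # ~1: the incoherent sum scales as N, not N^2
```

Coherent (equal) phases would instead give |∑|² = N², which is why 1/τ ends up linear, not quadratic, in the number of uncorrelated scatterers.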

No Turning Back: The Boltzmann Transport Equation


We have achieved a lot in the past chapters by investigating ballistic transport and paying only scant attention to electron scattering processes in semiconductors. The time has come to discuss scattering and its deep connection with the concepts of equilibrium, irreversibility, and entropy. The Boltzmann transport equation (BTE) is the framework that introduces the effect of scattering processes in solids on a common platform, one that bakes in the quantum mechanics of electrons and phonons together with thermodynamics and electromagnetism. A unified picture of the flow of energy in solids carried by charge and heat, and of their connection, will emerge. We will see how the BTE introduces and enforces irreversibility and entropy into reversible quantum mechanical transport processes.

Chapter contents (sidebar): 21.1 Micro vs. macro · 21.2 The Liouville theorem · 21.3 Boltzmann transport equation · 21.4 H-theorem and entropy · 21.5 Equilibrium distribution · 21.6 The RTA: time to relax! · 21.7 One formula to rule them all · 21.8 Electrical conductivity · 21.9 Thermoelectric properties · 21.10 Onsager relations · 21.11 Conservation laws

In this chapter, we learn:

• Why do we need the BTE to understand semiconductors? • How does the BTE help calculate transport properties? • What are the power and limitations of the BTE?

21.1 Micro vs. macro The current I we measure flowing in a semiconductor is a macroscopic value at the contacts. The macroscopic value is what ends up at the boundaries from the sum total of all microscopic currents flowing inside the bulk. The microscopic processes respond to external controls such as voltage bias at contacts, electric field effect through insulators, magnetic fields, temperature gradients, and photonic illumination. In this sense, the current I is a quantity similar to the pressure P of a gas, whose microscopic origin is in the collisions of molecules with the wall of a container and amongst themselves. To connect the microscopic processes of electrons to the measurable macroscopic current, we will be guided by the same kinetic theory that explains the behavior of gases. Let us start with a simple question: an electron is subjected to a constant external electric field F. What are its transport properties? Consider the classical mechanics of an electron starting from rest v(t =

Chapter contents, continued: 21.12 Berry curvature correction · 21.13 Limitations of the BTE · 21.14 Chapter summary section · Further reading




482 No Turning Back: The Boltzmann Transport Equation

0) = 0 at x(t = 0) = 0 in vacuum. The answer is precise: by Newton's law, −qF = m_e dv(t)/dt dictates that the electron velocity v(t) = −(qF/m_e)t increases in magnitude linearly with time t, i.e., the electron undergoes uniform acceleration. Its location and momentum [x(t), m_ev(t)] follow a sharply defined, deterministic path in the [x,p] space with time. How does the electron dynamics change in quantum mechanics? The eigenfunctions and eigenvalues allowed for a free electron in vacuum in the absence of an electric field are ψ(r) = e^{ik·r}/√V and E(k) = ħ²k²/2m_e. This set does not allow the initial conditions [x(t = 0) = 0, v(t = 0) = 0], due to the uncertainty principle rooted in the wave–particle duality. However, since the allowed eigenfunctions form a complete set, quasi-localized states can be constructed by the linear

Fig. 21.1 Illustration of the motion of a wavepacket in the phase–space.

¹ Solving the time-dependent Schrödinger equation iħ ∂ψ/∂t = [−(ħ²/2m_e)∇² + V(r)]ψ, where V(r) = −qF·r, yields the dynamical solution ψ(r,t). The details of the solution reside in the coefficients c_k of the wavepacket.

² This is valid so long as the electron stays within the same band and does not transition to another. Such situations are described in Section 21.13 at the end of this chapter. The discussion in this chapter is restricted to relatively weak electric and magnetic fields and no optical excitation, such that electron transport occurs in a single band.

combinations ψ(r,t) = ∑_k c_k e^{i(k·r − ħk²t/2m_e)}, which are wavepackets, such that their center coordinates are well defined. The electric field will then change the center k and r of the electron wavepacket with time. The quantum state will be composed of a range of [r,k] values. In the BTE language, the [r,k] space is called the phase space, illustrated in Fig. 21.1 for 1D. Because there is only one electron, the quantum mechanical wavepacket solution ψ(r,t) must satisfy¹, for the general 3D case, ∫d³r · (d³k/(2π)³) · |ψ(r,t)|² = 1 at all times t. Because there is just one electron, there is no Pauli blocking due to the exclusion principle. The trajectory in the phase space is diffuse compared to the sharp classical path, due to the uncertainty principle ∆x·∆k ≥ ½. The "center" of the wavepacket tracks the path given by F = ħ dk/dt.

Now consider the single electron residing in a crystal. The periodic crystal potential V(r + a) = V(r) rearranges the electron energy eigenvalues into bands E_n(k) with corresponding Bloch eigenfunctions ψ_k(r) = e^{ik·r}u_k(r). Fig. 21.2 indicates the single electron in a band. In Chapter 9 we learned that the dynamics of a Bloch wavepacket ψ(r,t) = ∑_k c_k ψ_k(r)e^{−iE_n(k)t/ħ} in the phase space is given by²

\[
v(k) = \frac{dr}{dt} = \underbrace{\frac{1}{\hbar}\nabla_k E_n(k)}_{v_g(k)} - \frac{F}{\hbar} \times \Omega_k \quad \& \quad \frac{d(\hbar k)}{dt} = \underbrace{(-q)\left[ \mathbf{E} + v_g(k) \times \mathbf{B} \right]}_{\text{Lorentz force}}, \tag{21.1}
\]

where [r,k] is the center of the Bloch wavepacket in the phase space, indicated by [x₀,k₀] in Fig. 21.1, and Ω_k is the Berry curvature of the band at wavevector k. For a constant electric field E, with B = 0 and in a crystal for which Ω_k = 0, the center of the wavepacket will change linearly with time as k(t) = k(0) − (qE/ħ)t. If k(0) = 0, the electron starts at the bottom of the band with a zero group velocity. From Fig. 21.2 it can then be seen that the electron group velocity, given by the slope v_g(k) = (1/ħ) dE_n(k)/dk, will initially increase, reach a maximum near the middle of the band, and then decrease to zero as the state reaches the top of the band, or the Brillouin zone edge at k = ±π/a. It will re-enter the BZ with an opposite velocity direction. The electron will


then perform oscillations in the k-space and in energy E_n(k(t)), which translate also to oscillations in real space. However, these oscillations, predicted by Bloch as a remarkable consequence of the quantum mechanics of electrons in periodic crystals (see Problems 8.8 and 9.5), are rarely observed in practice. Why are Bloch oscillations not commonplace? The first reason is that scattering processes in crystals throw the electron off the ballistic trajectory³. As shown in the bandstructure of Fig. 21.2, an electron can scatter from state k to k′ at a rate S(k → k′), preventing oscillations. Such scattering is only possible by processes that break the periodicity of the crystal potential, such as defects and phonons⁴. The effective mass equation already includes the crystal potentials of bulk semiconductors and perfect semiconductor heterostructures. Even if we removed all scattering of the first kind, the second reason Bloch oscillations are not common in real solids is the presence of not just one, but a large number of electrons in the band. The transport mechanism and scattering are significantly affected by the Pauli exclusion principle and by electron–electron scattering. In typical metals the density is of the order n ∼ 10²³/cm³, which fills a significant part of the Brillouin zone, and in a semiconductor n ∼ 10¹⁵–10²⁰/cm³ fills a smaller fraction of it. These large numbers make it difficult to track the [r,k] of each electron in time using Equations 21.1. Furthermore, to find macroscopic properties such as currents, we sum over all k-states and throw away the details, making the exercise unnecessary as well! We therefore change our point of view to a far more powerful method for connecting the large numbers in the microscopic picture to the macroscopic transport coefficients such as the electrical or thermal conductivity.
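The semiclassical Bloch oscillation described above is simple to simulate. The sketch below assumes a hypothetical 1D tight-binding band E(k) = −2J cos(ka) (an illustrative choice, not the book's), integrates dk/dt = −qF/ħ and dx/dt = v_g = (1/ħ)dE/dk with ħ = a = q = 1, and checks that the real-space motion is bounded and periodic with the Bloch period T_B = 2π/F:

```python
import numpy as np

J = 1.0      # half-bandwidth parameter of the assumed band E(k) = -2J*cos(k)
F = 0.05     # electric field strength (units with hbar = a = q = 1)

TB = 2.0 * np.pi / F                # Bloch period (t = h/qFa in SI units)
t = np.linspace(0.0, 2.0 * TB, 20001)
k = -F * t                          # dk/dt = -F, starting from k(0) = 0

vg = 2.0 * J * np.sin(k)            # group velocity v_g = dE/dk
x = np.cumsum(vg) * (t[1] - t[0])   # real-space trajectory (crude quadrature)

# The electron oscillates in k, in energy E(k(t)), and in real space:
print("x excursion:", x.min(), "expected -4J/F =", -4.0 * J / F)
```

Analytically x(t) = −(2J/F)(1 − cos(Ft)): the electron shuttles back and forth over a distance 4J/F instead of accelerating indefinitely, exactly the Bloch oscillation that scattering destroys in practice.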

21.2 The Liouville theorem

An electron wavepacket is also called a quasiparticle, to remind us that by itself it does not have a well-defined [x,k], but the center of the wavepacket does⁵. Quantum mechanics is responsible for how the wavepacket came about in the first place: through a linear superposition of Bloch states labeled by k, which gave the wavepacket a finite spread in x. We will refer to an electron wavepacket simply as an "electron", in a slight abuse of language. Instead of tracking the [x,k] of each electron in the semiconductor band, we will end up tracking the probability density f(x,k,t) of finding a collection of electrons in the neighborhood of [x,k] at time t. But let us make the transition from knowing everything about every electron to a restricted collective knowledge in steps. The individual [r_i, k_i] of each of the i = 1, 2, ..., N electrons define a 6N-dimensional phase space called the Γ-space. Then, the state of the N-electron system at any given time is given by a point in this 6N-dimensional Γ-space. Several distinct points in the Γ-space can correspond to the same macroscopic property of the N electrons, such as their total energy, or the


³ The time the wavepacket needs to traverse the entire BZ is t = h/qFa ∼ 100 ps for a small field F ∼ 1 kV/cm. Typical scattering times are sub-ps. See Exercise 21.1.

⁴ It is important to distinguish which potentials scatter electrons, and which do not. In Chapter 14 we learned the effective mass approximation with wavepacket eigenfunctions. In a semiconductor heterostructure, the band offsets are already included in the potential. For electrons confined in a heterostructure quantum well with perfect interfaces, in the effective mass approximation a heterojunction band offset is not a scattering potential. Impurities such as dopants or interface roughness will then scatter electrons, not the band offsets.
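The traversal-time estimate in the margin note is quick to verify; the lattice constant a = 0.5 nm below is an assumed, representative value:

```python
h = 6.62607015e-34      # Planck constant, J*s
q = 1.602176634e-19     # elementary charge, C
F = 1e5                 # field: 1 kV/cm expressed in V/m
a = 0.5e-9              # assumed lattice constant, m

t_bz = h / (q * F * a)  # time to traverse the Brillouin zone, t = h/(qFa)
print(f"t = {t_bz * 1e12:.0f} ps")   # ~80 ps, i.e. of order 100 ps
```

Since scattering times are sub-ps, an electron is scattered roughly a hundred times before it could complete even one Bloch cycle at this field.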

Fig. 21.2 Scattering of an electron from a state k to k0 at a rate S(k → k0 ) given by Fermi’s golden rule.

⁵ In this section, we interchangeably use x and k for the vectors r and k to avoid excessive symbols, but it should be clear from the context when vectors are appropriate.

484 No Turning Back: The Boltzmann Transport Equation

⁶ The full-blown quantum version tracks the density matrix ρ̂(x,k,t), which retains the phase relationships between the electron wavefunctions. The diagonal matrix elements of the density matrix are ρ(x,k,t), the semi-classical probability density of the Liouville equation. For a majority of semiconductor transport phenomena the density matrix is an overkill, and so is the Liouville density, so we transition in steps from ρ(x,k,t) in the Γ-space to f(x,k,t) in the micro (or µ-) space, where the BTE is defined.

Fig. 21.3 The Liouville theorem states that the shaded area of the phase space of occupied states is preserved when the dynamics or motion of particles occupying them follow the rules of classical mechanics. The shaded area can deform due to the motion, but will preserve the area. In the absence of scattering, the trajectories of phase points in the Γ-space do not intersect.

macroscopic current carried by all of them. Distinct microscopic states (or points) that lead to the same macroscopic property are called ensembles, and are represented by a set of points in the Γ-space. This enormous 6N-dimensional phase space, and the concept of ensembles was introduced by Gibbs. The Liouville theorem discussed in this section tracks the motion of ρ( x, k, t) = ρ(r1 , r2 , ...r N , k1 , k2 , ...k N , t), the many-electron distribution function, in the Γ-space when subjected to the laws of classical dynamics6 . The number of points representing the electrons in a small neighborhood of [ x, k] in the 6N-dimensional Γ-space at time t is ρ( x, k, t) · dx · dk. The shaded area in Fig. 21.3 is dx · dk. When these electrons are subjected to a force F = −dV/dx due to a potential V, each phase point moves to a new [ x 0 , k0 ] an instant dt later. What is the new area dx 0 · dk0 at time t + dt, i.e., how does the phase–space volume change in time? Note that this is indeed a new point of view, where we track a collective distribution function, rather than an individual electron. The answer to the question posed above was first found by Joseph Liouville (Fig. 21.4), and is called the Liouville theorem of classical mechanics. If there is no scattering, the rather remarkable fact is that the phase–space volume is conserved, meaning dx 0 · dk0 = dx · dk, the area of the shaded regions in Fig. 21.3 remains the same, though it may deform: phase space flows as an incompressible fluid. To see why, let us track the phase space point [ xα , k α ] (where α is a coordinate index) in time. 
Writing the classical momentum as p_α = mv_α = ħk_α, with dx_α/dt = ẋ_α and dp_α/dt = ṗ_α, the classical Hamiltonian is the energy H = p²/2m_e + V, and retaining terms first order in dt, we have

\[
x'_\alpha = x_\alpha + \dot{x}_\alpha \, dt + O(dt^2), \quad \text{and} \quad dx'_\alpha = dx_\alpha + \frac{\partial \dot{x}_\alpha}{\partial x_\alpha} \, dx_\alpha \, dt + O(dt^2),
\]
\[
p'_\alpha = p_\alpha + \dot{p}_\alpha \, dt + O(dt^2), \quad \text{and} \quad dp'_\alpha = dp_\alpha + \frac{\partial \dot{p}_\alpha}{\partial p_\alpha} \, dp_\alpha \, dt + O(dt^2)
\]
\[
\implies dx'_\alpha \cdot dp'_\alpha = dx_\alpha \cdot dp_\alpha \cdot \Big( 1 + dt \underbrace{\Big( \frac{\partial \dot{x}_\alpha}{\partial x_\alpha} + \frac{\partial \dot{p}_\alpha}{\partial p_\alpha} \Big)}_{\Delta} + O(dt^2) \Big).
\]

Fig. 21.4 Joseph Liouville, French mathematician who recognized that time-evolution in systems governed by Hamiltonian equations preserves the area of phase–space.

Hamilton's equations

\[
\dot{x}_\alpha = +\frac{\partial H}{\partial p_\alpha} \quad \text{and} \quad \dot{p}_\alpha = -\frac{\partial H}{\partial x_\alpha}
\]

then give

\[
\frac{\partial \dot{x}_\alpha}{\partial x_\alpha} = +\frac{\partial^2 H}{\partial x_\alpha \partial p_\alpha} \quad \& \quad \frac{\partial \dot{p}_\alpha}{\partial p_\alpha} = -\frac{\partial^2 H}{\partial p_\alpha \partial x_\alpha} \implies \Delta = 0 \implies dx'_\alpha \cdot dp'_\alpha = dx_\alpha \cdot dp_\alpha,
\]

which proves that the phase–space area is preserved in the dynamics. Note that ∆ = 0 is ensured by Hamilton's equations of classical mechanics shown in the first box, which are simply the velocity ẋ_α = +∂H/∂p_α = p_α/m_e and Newton's law ṗ_α = −∂H/∂x_α = −∂V/∂x_α. Because of the Liouville theorem, ρ(x′,k′,t+dt) = ρ(x,k,t), which means that in the absence of scattering events, the state that was at [x,p] at time t has moved to [x′,p′] at time t+dt in a streaming motion.
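The area preservation can be seen numerically. A minimal sketch (with assumed m = k = 1): evolve a small parallelogram of initial conditions in (x,p) for the harmonic oscillator H = p²/2m + kx²/2. A symplectic leapfrog step, which respects Hamilton's structure, preserves the phase-space area to machine precision, while a naive forward-Euler step inflates it:

```python
import numpy as np

m = k = 1.0
dt, nsteps = 0.05, 1000

def euler(x, p):
    """Forward Euler: NOT area-preserving (its Jacobian has det = 1 + k*dt^2/m)."""
    return x + dt * p / m, p - dt * k * x

def leapfrog(x, p):
    """Kick-drift-kick leapfrog: a composition of shears, each with det = 1."""
    p = p - 0.5 * dt * k * x
    x = x + dt * p / m
    p = p - 0.5 * dt * k * x
    return x, p

def area_after(step):
    # Both update maps are linear in (x,p), so the area of any parallelogram
    # scales by |det|; it suffices to evolve its two edge vectors (dx, dp).
    u = np.array([1e-3, 0.0])
    w = np.array([0.0, 1e-3])
    for _ in range(nsteps):
        u = np.array(step(*u))
        w = np.array(step(*w))
    return abs(u[0] * w[1] - u[1] * w[0])   # |cross product| = evolved area

A0 = 1e-6   # initial parallelogram area
print("leapfrog area ratio:", area_after(leapfrog) / A0)  # stays at 1
print("euler    area ratio:", area_after(euler) / A0)     # grows as (1 + k*dt^2/m)^nsteps
```

This is exactly the distinction Liouville's theorem draws: only dynamics obeying Hamilton's equations (or a discretization that preserves their structure) flows through phase space as an incompressible fluid.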



Taylor expanding ρ in x'_α, p'_α, and dt, we can write

\[
\rho(x + \dot{x}\,dt, \, p + \dot{p}\,dt, \, t + dt) = \rho(x,p,t) + \frac{\partial \rho}{\partial x} dx + \frac{\partial \rho}{\partial p} dp + \frac{\partial \rho}{\partial t} dt,
\]
\[
\frac{d\rho}{dt} = \frac{\rho(x',k',t+dt) - \rho(x,k,t)}{(t+dt) - t} = 0 \implies \frac{\partial \rho}{\partial t} + \dot{x}\frac{\partial \rho}{\partial x} + \dot{p}\frac{\partial \rho}{\partial p} = 0,
\]
\[
\boxed{ \; \frac{\partial \rho}{\partial t} + v \cdot \nabla_r \rho + F \cdot \nabla_p \rho = 0 \; }, \quad \text{or equivalently,} \quad \frac{\partial \rho}{\partial t} + \nabla \cdot J = 0, \tag{21.3}
\]

where the boxed version is generalized to any dimension with r = (..., x_α, ...), v = (..., ẋ_α, ...), and p = (..., p_α, ...), and J = ρv_tot is the current density⁷. In this form, we recognize the Liouville equation as a continuity equation, which is the consequence of particle number conservation. An interesting way to write the Liouville theorem is

\[
\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_\alpha \Big( \frac{\partial \rho}{\partial x_\alpha} \dot{x}_\alpha + \frac{\partial \rho}{\partial p_\alpha} \dot{p}_\alpha \Big) = 0
\]
\[
\implies \frac{\partial \rho}{\partial t} = -\sum_\alpha \Big( \frac{\partial \rho}{\partial x_\alpha} \frac{\partial H}{\partial p_\alpha} - \frac{\partial \rho}{\partial p_\alpha} \frac{\partial H}{\partial x_\alpha} \Big) = -\{\rho, H\} \implies \boxed{ \; \frac{\partial \rho}{\partial t} + \{\rho, H\} = 0 \; }, \tag{21.4}
\]

⁷ The current in this compact form is defined with v_tot = (ṙ₁, ṙ₂, ..., ṙ_N, ṗ₁, ṗ₂, ..., ṗ_N), a 6N-dimensional vector.

where the $\mathbf{v}\cdot\nabla_r \rho + \mathbf{F}\cdot\nabla_p \rho = \nabla\cdot\mathbf{J}$ term of Equation 21.3 is identified as $\{\rho, H\}$, the Poisson bracket of the distribution function $\rho$ with the Hamiltonian $H$. This formulation carries over nicely to quantum mechanics. The expectation value of any macroscopic physical property $g(x,p)$ (such as current, entropy, or energy) defined as a function of the points in the phase space is $\langle g \rangle = \langle g\rho \rangle$ for the distribution $\rho(x,p,t)$. The time evolution of its expectation value is then given by $\frac{d\langle g\rangle}{dt} = \langle\{g, H\}\rangle$. From Equation 21.4, an equilibrium distribution function $\rho_0$ must be independent of time, and therefore must have $\{\rho_0, H\} = 0$. If the equilibrium distribution function is an explicit function of the Hamiltonian or the total energy itself, $\rho_0 = \rho_0(H(x,p))$, then $\{\rho_0, H\} = 0$ is a mathematical property of the Poisson bracket. Since $\rho(x,k,t)$ is the probability density, summing or integrating this function over the entire phase space gives the total number of particles at that time: $\int dx\, dk\, \rho(x,k,t) = N(t)$. If the total number of particles in the entire phase space does not change with time, then $\int dx\, dk\, \rho(x,k,t) = \int dx\, dk\, \rho_0(x,k) = N$. Thus the Liouville equation, which is still exact, tracks the relative locations and momenta of the $N$-particle ensembles in the $\Gamma$-space via the many-particle distribution function $\rho(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{p}_1, \ldots, \mathbf{p}_N, t)$. Let us now allow for scattering between particles, and ask how we may track the properties of a single particle, say the one with subscript 1. The single-particle distribution function for particle 1 is obtained by integrating the distribution function over the coordinates and momenta

[Footnote 8: The Poisson bracket for any two functions $A$ and $B$ in the phase space $[x_\alpha, p_\alpha]$ in classical mechanics is defined as $\{A,B\} = \sum_\alpha \big(\frac{\partial A}{\partial x_\alpha}\cdot\frac{\partial B}{\partial p_\alpha} - \frac{\partial A}{\partial p_\alpha}\cdot\frac{\partial B}{\partial x_\alpha}\big)$. Note that although $A$ and $B$ are not operators, $\{B,A\} = -\{A,B\}$. The quantum mechanical generalization of the Liouville theorem is the von Neumann formula for the density matrix, $\frac{\partial \hat\rho}{\partial t} = \frac{i}{\hbar}[\hat\rho, \hat{H}]$, where $[\cdot,\cdot]$ is the operator commutator.]

[Footnote 9: We will see soon that the equilibrium single-particle distribution function for electrons is the Fermi-Dirac function $f_0 = 1/(\exp[(E(k,r)-\mu)/k_bT] + 1)$, where $E(k,r) = H$ are indeed the energy eigenvalues of the Hamiltonian. Furthermore, any physical conserved quantity whose expectation value over the distribution does not change with time (called a constant of motion) is required to satisfy the relation $\{g,H\} = 0$, because of the relation $\frac{d\langle g\rangle}{dt} = \langle\{g,H\}\rangle$. This is the origin of the laws of conservation in physics.]
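The footnote's claim that $\{\rho_0, H\} = 0$ whenever $\rho_0$ depends on $(x,p)$ only through $H$ is easy to verify numerically. The sketch below (an added illustration; the anharmonic Hamiltonian and the form $\rho_0 = e^{-H}$ are arbitrary choices) evaluates the Poisson bracket by central differences:

```python
import math

def H(x, p):
    """An anharmonic Hamiltonian, so the result is not an accident of symmetry."""
    return 0.5 * p * p + 0.25 * x ** 4

def rho0(x, p):
    """An equilibrium candidate that is an explicit function of H alone."""
    return math.exp(-H(x, p))

def poisson(A, B, x, p, h=1e-5):
    """Central-difference {A, B} = dA/dx * dB/dp - dA/dp * dB/dx."""
    dAdx = (A(x + h, p) - A(x - h, p)) / (2 * h)
    dAdp = (A(x, p + h) - A(x, p - h)) / (2 * h)
    dBdx = (B(x + h, p) - B(x - h, p)) / (2 * h)
    dBdp = (B(x, p + h) - B(x, p - h)) / (2 * h)
    return dAdx * dBdp - dAdp * dBdx

residuals = [abs(poisson(rho0, H, x, p)) for (x, p) in [(0.3, -1.2), (1.1, 0.7), (-0.8, 0.5)]]
print(max(residuals))  # ~0: any rho0(H) is stationary under the Liouville flow
```

The chain rule makes this exact: $\{\rho_0(H), H\} = \rho_0'(H)\{H,H\} = 0$, so the numerical residual is limited only by the finite-difference step.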

No Turning Back: The Boltzmann Transport Equation

of all $(N-1)$ other particles except particle 1:

$$f^{(1)}(\mathbf{r}_1, \mathbf{p}_1, t) = \frac{N!}{(N-1)!} \int d\mathbf{r}_2\, d\mathbf{r}_3 \ldots d\mathbf{r}_N\, d\mathbf{p}_2\, d\mathbf{p}_3 \ldots d\mathbf{p}_N \cdot \rho(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{p}_1, \ldots, \mathbf{p}_N, t).$$

The factorials in front are necessary because the $N$ particles are assumed identical and indistinguishable. If we account for inter-particle scattering between particles 1 and 2, the single-particle occupation function $f^{(1)}(\mathbf{r}_1, \mathbf{p}_1, t)$ will depend on the two-particle occupation function

$$f^{(2)}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2, t) = \frac{N!}{(N-2)!} \int d\mathbf{r}_3\, d\mathbf{r}_4 \ldots d\mathbf{r}_N\, d\mathbf{p}_3\, d\mathbf{p}_4 \ldots d\mathbf{p}_N \cdot \rho(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{p}_1, \ldots, \mathbf{p}_N, t),$$

which in turn will depend on $f^{(3)}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3, \mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, t)$, the 3-particle distribution function, and so on. In the presence of collisions, the collisionless Liouville equation of Equation 21.4 evolves into a hierarchy of $N$ coupled equations. [Footnote 10: This set of coupled equations is known as the BBGKY hierarchy, after Bogoliubov, Born, Green, Kirkwood, and Yvon.] This is because the inter-particle scattering potentials "entangle", or correlate, the single-particle distribution functions. For example, the scattering potential $V(\mathbf{r}_1, \mathbf{r}_2)$ makes the 2-particle distribution function

$$f^{(2)}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2, t) = f^{(1)}(\mathbf{r}_1, \mathbf{p}_1, t) \cdot f^{(1)}(\mathbf{r}_2, \mathbf{p}_2, t) + g(\mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2, t), \tag{21.7}$$

where the first term on the right is a product of single-particle distributions, but the second term does not factorize into functions for the two particles. Such correlations make the hierarchy of equations unsolvable in the presence of scattering. [Footnote 11: When the mean free path between scattering events is smaller than the effective size of a particle, the correlations between particles are strong and cannot be neglected. Then one must resort to molecular dynamics solutions of the Liouville equation, or use alternate formalisms, such as Einstein's stochastic model for Brownian motion. The effective size for electrons in typical semiconductors is the de Broglie wavelength. When the electron mean free path is much longer than its de Broglie wavelength, the correlations are indeed weak and the BTE is an excellent approximation.] Boltzmann made the following bold assumption: if the inter-particle correlations are weak, the correlation factor $g(\mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2, t) \approx 0$ in Equation 21.7, and the $N$-particle distribution function factorizes into a product of single-particle functions:

$$f^{(N)}(\mathbf{r}_1, \ldots, \mathbf{r}_N, \mathbf{p}_1, \ldots, \mathbf{p}_N, t) = f^{(1)}(\mathbf{r}_1, \mathbf{p}_1, t) \cdot f^{(1)}(\mathbf{r}_2, \mathbf{p}_2, t) \cdots f^{(1)}(\mathbf{r}_N, \mathbf{p}_N, t), \tag{21.8}$$

and the hierarchy of Liouville equations truncates into a single equation, in a much lower-dimensional phase space. [Footnote 12: The historical name of this crucial approximation is stosszahlansatz, popularly referred to as the assumption of molecular chaos. It means that a scattering event erases the memory of the phase-space points the particles occupied prior to scattering. The quantum analogy is when a many-particle wavefunction is written as the product of individual wavefunctions.] The single-particle occupation function $f^{(1)}(\mathbf{r}_1, \mathbf{p}_1, t)$ is defined in a 6-dimensional $[x,k]$ space. The $N$ particles then occupy a set of $N$ independent points in this 6-dimensional space, called the $\mu$-space, or the microspace. Note the enormous conceptual simplification in going to the $\mu$-space. We drop the superscript of the single-particle distribution function $f^{(1)}$ and denote it by $f(x,k,t)$. The Boltzmann transport equation lives in this reduced 6-D microspace. The distribution function of the BTE is thus the single-particle approximation, to which we turn our attention.



21.3 Boltzmann transport equation

In 1872, Boltzmann (Fig. 21.5) introduced a famous scattering term into the Liouville equation in an effort to explain the behavior of gases. However, his aim was to do more than explain gases: he wanted to derive the then-empirical laws of thermodynamics from the complete and quantitative laws of classical mechanics. Boltzmann succeeded for the most part in achieving these goals through his scattering term. The Boltzmann transport equation has a fascinating legacy that gave birth to the fields of statistical mechanics and quantum theory by introducing the concepts of entropy and irreversibility, and an "arrow of time". We are now ready to learn how this came about, and to use these ideas for semiconductor electronics and photonics. From Fig. 21.6, to account for scattering events, we must also allow for $f(x,k,t)$ to increase by scattering "in" from nearby $k'$ points on other trajectories, and for it to decrease by scattering "out":

$$f = f(x,k,t) = f\Big(x - v\,dt,\; k - \frac{F}{\hbar}\,dt,\; t - dt\Big) + (S_{in} - S_{out})\,dt,$$

where $S_{in}$ and $S_{out}$ are the in- and out-scattering rates, in units of 1/s. Taylor-expanding the first term on the right and rearranging, we get the 1D Boltzmann transport equation

$$\frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x} + \frac{F}{\hbar}\,\frac{\partial f}{\partial k} = S_{in} - S_{out}. \tag{21.10}$$
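In the collisionless case $S_{in} = S_{out} = 0$, the solution is simply the initial occupation carried along the Liouville trajectories. The short sketch below (an added illustration, in units $\hbar = m = 1$ so that $v = k$ and $\dot{k} = F$; the Gaussian initial occupation and force value are arbitrary) checks by finite differences that $f(x,k,t) = f_0(x - kt + \frac{1}{2}Ft^2,\; k - Ft)$ satisfies the streaming equation:

```python
import math

F = 0.7   # constant external force (arbitrary illustrative value)

def f0(x, k):
    """Arbitrary smooth initial occupation: a Gaussian packet in (x, k)."""
    return math.exp(-x * x) * math.exp(-(k - 1.0) ** 2)

def f(x, k, t):
    """Initial occupation carried backwards along the ballistic trajectory."""
    return f0(x - k * t + 0.5 * F * t * t, k - F * t)

def residual(x, k, t, h=1e-4):
    """df/dt + v df/dx + F df/dk evaluated by central differences (v = k)."""
    dfdt = (f(x, k, t + h) - f(x, k, t - h)) / (2 * h)
    dfdx = (f(x + h, k, t) - f(x - h, k, t)) / (2 * h)
    dfdk = (f(x, k + h, t) - f(x, k - h, t)) / (2 * h)
    return dfdt + k * dfdx + F * dfdk

worst = max(abs(residual(x, k, t))
            for (x, k, t) in [(0.2, 0.9, 0.3), (-0.5, 1.4, 1.0), (1.0, 0.4, 2.0)])
print(worst < 1e-5)  # True: the streaming terms annihilate the ballistic solution
```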


From Fig. 21.6 we see that the net in- and out-scattering rates depend on the occupation functions of the initial and final states. Let us denote $f(x,k,t) = f_k$, and $S(k\to k')$ as the individual scattering rate from state $|k\rangle \to |k'\rangle$. [Footnote 13: The individual scattering rate is found using Fermi's golden rule: $S(k\to k') = \frac{2\pi}{\hbar}|W_{k,k'}|^2\,\delta(E_k - E_{k'} \pm \hbar\omega)$, where $W_{k,k'} = \langle k'|W(\mathbf{r})|k\rangle$ is the matrix element of the scattering potential $W(\mathbf{r})$ between states $|k\rangle$ and $|k'\rangle$, as was discussed in Chapter 20.] The net rate of this process is proportional to $f_k$, since there must be an electron in state $k$ to scatter out of. The rate must also be proportional to $(1 - f_{k'})$, because if state $k'$ were filled, another electron could not scatter into it. Pauli's exclusion principle for electrons therefore requires the net in- and out-scattering rates to be

$$S_{in} = S(k'\to k)\, f_{k'}(1 - f_k), \quad\text{and}\quad S_{out} = S(k\to k')\, f_k(1 - f_{k'}). \tag{21.11}$$

Now to track the occupation function of state $|k\rangle$, we must allow for in- and out-scattering to all other states $|k'\rangle$. Then, using Equations 21.11 in 21.10, the Boltzmann transport equation becomes:

$$\frac{\partial f_k}{\partial t} + v\,\frac{\partial f_k}{\partial x} + \frac{F}{\hbar}\,\frac{\partial f_k}{\partial k} = \underbrace{\sum_{k'} \big[S(k'\to k)\, f_{k'}(1-f_k) - S(k\to k')\, f_k(1-f_{k'})\big]}_{\hat{C} f_k}. \tag{21.12}$$

[Fig. 21.5: Ludwig Boltzmann, the father of statistical mechanics and the kinetic theory of gases. He formulated the formula for entropy $\mathcal{S}$ as a function of the degrees of freedom $\Omega$: this formula, $\mathcal{S} = k_b \ln \Omega$, where $k_b$ is the Boltzmann constant, is inscribed on his gravestone. The concept of entropy permeates all branches of the sciences, including communication systems. Boltzmann ended his life tragically by hanging himself in 1906.]

The left-hand side, called the flow term, is identical to the Liouville equation. The right-hand side is called the collision integral, and is formally denoted $\hat{C} f_k$. It accounts for all types of scattering and

488 No Turning Back: The Boltzmann Transport Equation

[Fig. 21.6: Connecting the picture of scattering within the band of a semiconductor with the picture of scattering in the phase-space. The Boltzmann collision integral accounts for in- and out-scattering from the Liouvillean ballistic trajectories of the wavepackets. At equilibrium, $f(x,k,t)$ is the Fermi-Dirac distribution function. Its non-equilibrium variation in the presence of scattering and external fields is obtained by solving the BTE. A scattering event from $k \to k'$ occurs at the same point in real space.]

[Footnote 14: Knowledge of $f_k$ gives us near-magical powers, because we can then easily calculate the macroscopic charge (or heat, spin, ...) transport currents due to the electrons via sums of the type $\sum_k \phi_k f_k$, where $\phi_k$ is the physical quantity of interest, as a result of all microscopic scattering events! $f_k$ connects the microscopic processes to the macroscopic observables, and is found by solving the BTE.]

collisions, and the summation over all $k'$ is converted to an integral, $\sum_{k'}(\ldots) \to \frac{L}{2\pi}\int dk'\,(\ldots)$. Solving the BTE in Equation 21.12 in principle gives us $f_k = f(x,k,t)$, the occupation probability of the wavepacket centered at $|k\rangle$ and $x$ at time $t$, in response to external electric fields or potential gradients, in the presence of all sorts of scattering events in the crystal. But the BTE in its raw form is not straightforward to solve in the most general case, because $f_k$ also appears in a tangled form inside the sum (or collision integral) on the right. This is why the BTE is called an integro-differential equation. Before we start seeking solutions, which will reveal a rich spectrum of phenomena, let us first generalize the BTE to higher dimensions by writing it as

$$\frac{\partial f_k}{\partial t} + \mathbf{v}_k \cdot \nabla_r f_k + \frac{\mathbf{F}}{\hbar} \cdot \nabla_k f_k = \hat{C} f_k, \tag{21.13}$$

[Footnote 15: The internal forces due to the periodic atomic potential are already accounted for in the energy bandstructure $E(k)$, which is buried in $f_k$. The term $\mathbf{F}$ is therefore only due to external forces, such as the Lorentz force due to electric and magnetic fields.]


where on the left side $\mathbf{v}_k = \frac{1}{\hbar}\nabla_k E(k)$ is the group velocity vector of state $k$, and $\mathbf{F}$ is the external force acting on the electrons. The collision integral on the right can be visualized using Fig. 21.7 (a), and classified into the three types of scattering events in that figure. Fig. 21.7 (b) shows elastic scattering, when $|k| = |k'|$ and $E_k = E_{k'}$. An example of elastic scattering is when an electron in state $k$ scatters from a charged dopant ion in the semiconductor into a state $k'$. Now, energy and momentum are always conserved in every scattering event for the whole system. But since the mass of the ion $M$ is much larger than the electron mass $m_e$, $m_e \Delta v_e \sim M \Delta v_{ion} \implies \Delta v_{ion} \sim \frac{m_e}{M}\Delta v_e \approx 0$. This implies that to a very good approximation, the heavy ion can change the momentum of the electron without changing the electron's kinetic energy, making it an elastic process for the electron. The collision term for elastic scattering that tracks the occupation of state $k$ is then the


Boltzmann transport equation 489

difference of the incoming and outgoing fluxes:

$$\hat{C} f_k = \sum_{k'} \big[S(k'\to k)\, f_{k'}(1-f_k) - S(k\to k')\, f_k(1-f_{k'})\big]$$

$$\underbrace{S(k'\to k) = S(k\to k')}_{\text{elastic scattering}} \implies \hat{C} f_k = \sum_{k'} S(k'\to k)\,(f_{k'} - f_k). \tag{21.14}$$

The equality $S(k'\to k) = S(k\to k') = \frac{2\pi}{\hbar}|W_{k,k'}|^2\,\delta(E_k - E_{k'})$ is guaranteed by Fermi's golden rule for elastic processes. This is because the scattering matrix element $\langle k'|W(\mathbf{r})|k\rangle = (\langle k|W(\mathbf{r})|k'\rangle)^\star$, implying $|W_{k,k'}|^2 = |W_{k',k}|^2$, and for the Dirac delta function, $\delta(+x) = \delta(-x)$ is an identity. Note that the sum (or integral) is over all $k'$. The collision term for inelastic scattering (Fig. 21.7 (c)) is

$$\hat{C} f_k = \sum_{k'} \big[S(k'\to k)\, f_{k'}(1-f_k) - S(k\to k')\, f_k(1-f_{k'})\big], \quad\text{with}$$

$$S(k\to k') = \sum_q \big[\underbrace{S(k, n_q \to k', n_q+1)}_{\text{emission } \propto\, n_q+1} + \underbrace{S(k, n_q \to k', n_q-1)}_{\text{absorption } \propto\, n_q}\big]$$

$$S(k'\to k) = \sum_q \big[\underbrace{S(k', n_q \to k, n_q+1)}_{\text{emission } \propto\, n_q+1} + \underbrace{S(k', n_q \to k, n_q-1)}_{\text{absorption } \propto\, n_q}\big]. \tag{21.15}$$


Here, $S(k, n_q \to k', n_q+1)$ is the scattering rate of an electron from state $k$ to state $k'$ by the emission of a phonon. The energy $E_{k'} = E_k \pm \hbar\omega_q$ is conserved for such emission and absorption processes for the combined system, and so is the momentum, $\hbar k' = \hbar k \pm \hbar q$. Because of the emission, the number of phonons $n_q$ in mode $q$ before scattering increases by one quantum, to $n_q + 1$ after the scattering event. In Chapter 16, Equation 16.18, we proved that the rates for emission and absorption are not the same, but are related by the Maxwell-Boltzmann factor $S(k\to k') = S(k'\to k) \times \exp[(E_k - E_{k'})/k_bT]$. Note that though in Equation 21.15 we have written out both emission and absorption processes for in- and out-scattering from state $k \to k'$, only one of them is allowed for a given choice of $E_k, E_{k'}$, since either $E_k > E_{k'}$ or $E_k < E_{k'}$ for inelastic scattering, and only one pair survives. Using these properties, the collision integral for inelastic scattering may be written as

$$\hat{C} f_k = \sum_{k'} S(k\to k')\Big[e^{\frac{E_k - E_{k'}}{k_b T}}\, f_{k'}(1-f_k) - f_k(1-f_{k'})\Big], \tag{21.16}$$

which we recognize as correct also for elastic scattering, which is the special case when $E_k = E_{k'}$. For inelastic scattering, implicit in $S(k\to k')$ is the sum $\sum_q(\ldots)$ over phonon modes, as seen in Equation 21.15. For electron-electron scattering, we have from Fig. 21.7,

$$\hat{C} f_k = \sum_{k_1, k', k_1'} \big[S(k', k_1' \to k, k_1)\, f_{k'} f_{k_1'} (1-f_k)(1-f_{k_1}) - S(k, k_1 \to k', k_1')\, f_k f_{k_1} (1-f_{k'})(1-f_{k_1'})\big], \tag{21.17}$$


[Fig. 21.7: Illustration of the scattering term of the Boltzmann equation (a), and the three types of scattering events: (b) electron-defect, (c) electron-phonon, and (d) electron-electron scattering.]


[Footnote 16: The Boltzmann equation captures kinetic phenomena in the "dilute" limit, $nd^3 \ll 1$, where $n$ is the particle density and $d$ the effective particle size. The opposite, collective limit $nd^3 \gg 1$ is described by a mean-field equation called the Vlasov equation, used in the study of plasmas. The collisions in the BTE follow the rules of quantum mechanics, conserve energy and momentum on the whole, and are in this sense "deterministic". The BTE-like variants, the Langevin and Fokker-Planck equations, introduce fictitious mathematical randomizing processes without regard to the physical scattering mechanisms. These models are used to study noise and fluctuations in semiconductors, but we do not discuss them further.]

[Footnote 17: Note that Boltzmann's original analysis was not for electrons, but for the molecules of a gas, following the laws of classical mechanics. Neither the electron nor its quantum mechanics was known at the time! Boltzmann's arguments have been adapted in the discussion of this section for electrons and their complete quantum statistics.]

in which, to track the occupation probability $f_k$ of state $k$, we must sum the rates over all other initial states $k_1$, and all final states $k'$ and $k_1'$. Thus the sum converted to an integral will take the form $\sum_{k_1, k', k_1'}(\ldots) \to \big(\frac{L}{2\pi}\big)^{3d}\int dk_1\, dk'\, dk_1'\,(\ldots)$. The electron-electron scattering rate $S(k', k_1' \to k, k_1)$ has the repulsive Coulomb interaction energy as the scattering potential, and respects the exchange process of fermions, as was discussed in Chapter 20. With a better understanding of the microscopic nature of the collision term of the BTE, let us now see one of the rather remarkable theorems first proven by Boltzmann.
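A quick numerical sanity check of the electron-electron collision integral above (an added sketch; the energies, Fermi level, and temperature are arbitrary choices): when the occupations are Fermi-Dirac and the collision conserves energy, the in- and out-scattering occupation products of Equation 21.17 cancel term by term.

```python
import math

def fd(E, mu=0.3, kT=0.5):
    """Fermi-Dirac occupation f0 = 1/(exp[(E - mu)/kT] + 1)."""
    return 1.0 / (math.exp((E - mu) / kT) + 1.0)

# Energies of an e-e collision k, k1 -> k', k1' chosen to conserve total energy:
Ek, Ek1, Ekp = 0.2, 0.9, -0.4
Ekp1 = Ek + Ek1 - Ekp

f, f1, fp, fp1 = fd(Ek), fd(Ek1), fd(Ekp), fd(Ekp1)
gain = fp * fp1 * (1 - f) * (1 - f1)   # in-scattering occupation product
loss = f * f1 * (1 - fp) * (1 - fp1)   # out-scattering occupation product
balanced = abs(gain - loss) < 1e-12
print(balanced)  # True: each term of Eq. 21.17 vanishes at equilibrium
```

This cancellation is exactly what Sections 21.4 and 21.5 derive from the H-theorem: $f/(1-f) = e^{-(E-\mu)/k_bT}$ makes the ratio of the two products $e^{-(E_k + E_{k_1} - E_{k'} - E_{k_1'})/k_bT} = 1$.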

21.4 H-theorem and entropy

In 1872, Boltzmann "derived", from Newton's laws of motion, the second law of thermodynamics (see Chapter 4), which states that spontaneous processes always increase the entropy of a system. He started with the question: how do collisions change the entropy of the electron system with time? We will see now that collisions take a system from an excited state of low entropy to its equilibrium state, where the entropy is maximized. This occurs in a manner such that the rate of change of entropy is positive (Boltzmann's H-theorem). In fact, the BTE will enable us to answer a far more general question: how do collisions and scattering determine the time evolution of any physical property of interest? Let us denote the physical property by $\phi_k$, which can be a time-dependent function of the wavevector $k$ and the occupation probability $f_k$ of the BTE. For now, let us focus on the effect of collisions when there are no concentration gradients or forces, implying $(\mathbf{v}_k \cdot \nabla_r f_k + \frac{\mathbf{F}}{\hbar}\cdot\nabla_k f_k) = 0$ in Equation 21.13. Denoting $\hat{C} f_k = \frac{\partial f_k}{\partial t}\big|_{coll.}$, the time evolution of the macroscopic (measurable) value of the physical property $\rho_\phi(\mathbf{r},t) = \sum_k \phi_k f_k(\mathbf{r},t)$ is given by

$$\frac{\partial \rho_\phi}{\partial t} = \frac{\partial}{\partial t}\sum_k \phi_k f_k(\mathbf{r},t) = \sum_k \frac{\partial f_k}{\partial t}\Big|_{coll.}\cdot\Big(\phi_k + f_k\frac{\partial \phi_k}{\partial f_k}\Big) = \sum_k \frac{\partial f_k}{\partial t}\Big|_{coll.}\cdot\bar{\phi}_k, \tag{21.18}$$

where $\bar{\phi}_k = \phi_k + f_k\frac{\partial \phi_k}{\partial f_k}$. [Footnote 18: Note that this is an example of taking the moments of the BTE. This specific equilibrium case is generalized to non-equilibrium situations in Section 21.6. Also note that if $\phi_k$ depends on $k$ and on $f_k$ (and hence on time), then $\bar{\phi}_k = \phi_k + f_k\frac{\partial \phi_k}{\partial f_k}$. If $\phi_k$ depends on $k$ and not on $f_k$, then $\bar{\phi}_k = \phi_k$.] So to find how collisions change the physical property $\phi_k$, we must evaluate the sum (or integral) of Equation 21.18 for the appropriate property. For the collision term $(\partial f_k/\partial t)|_{coll.}$ of Equation 21.18, consider the situation when defect and phonon scattering are absent. Then electron-electron scattering, Equation 21.17, survives as the only term in the collision integral of the BTE. Neglecting electron spin, every electron-electron scattering event is elastic [Footnote 19: Electron-electron scattering is analogous to the inter-molecular collisions in a gas. Spin is an internal degree of freedom of the electron, analogous to the internal degrees of freedom of molecules, such as rotational or vibrational modes.], and follows $S(k, k_1 \to k', k_1') = S(k', k_1' \to k, k_1)$. The rate of change of $\rho_\phi$ in Equation 21.18 then becomes

$$\frac{\partial \rho_\phi}{\partial t} = \sum_{k, k_1, k', k_1'} S(k', k_1' \to k, k_1) \times \bar{\phi}_k \times \big[f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1}) - f_k f_{k_1}(1-f_{k'})(1-f_{k_1'})\big]. \tag{21.19}$$




The summation is over all four $k$-wavevectors indicated in Fig. 21.7 (d). By solving Exercise 21.2, with a mathematical trick interchanging the four $k$'s using physical arguments, you can show (as Boltzmann did) that the time evolution of the property $\rho_\phi$ of the electron system is then given by the somewhat non-intuitive relation that holds for any $\phi_k$:

$$\frac{\partial \rho_\phi}{\partial t} = -\frac{1}{4}\sum_{k, k_1, k', k_1'} S(k, k_1 \to k', k_1') \times (\bar{\phi}_k + \bar{\phi}_{k_1} - \bar{\phi}_{k'} - \bar{\phi}_{k_1'}) \times \big[f_k f_{k_1}(1-f_{k'})(1-f_{k_1'}) - f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})\big]. \tag{21.20}$$



For a demonstration of the deep physics hiding in Equation 21.20, let us first choose that particular $\phi_k$ such that the corresponding $\rho_\phi$ represents the entropy of the electron system. Recall from Chapter 4, Equation 4.3, that the entropy of an electron state $k$ is given in terms of its occupation function $f_k$ by $\mathcal{S}_k = -k_b[f_k \ln f_k + (1-f_k)\ln(1-f_k)]$, due to its fermionic statistics, in a general non-equilibrium situation. To find the rate of change of the entropy of the entire distribution, $\rho_\phi(\mathbf{r},t) = -k_b\sum_k [f_k \ln f_k + (1-f_k)\ln(1-f_k)]$, direct substitution verifies that we should choose the function

$$\phi_k = \frac{\mathcal{S}_k}{f_k} = -k_b\Big[\ln f_k + \frac{1-f_k}{f_k}\ln(1-f_k)\Big]. \tag{21.21}$$

Then, from Equation 21.18 we have

$$\bar{\phi}_k = \phi_k + f_k\frac{\partial \phi_k}{\partial f_k} = k_b\cdot\ln\Big(\frac{1-f_k}{f_k}\Big), \tag{21.22}$$


and using it in Boltzmann's form in Equation 21.20, we obtain

$$\frac{\partial \rho_\phi}{\partial t} = -\frac{k_b}{4}\sum_{k, k_1, k', k_1'} S(k, k_1 \to k', k_1') \times \big[f_k f_{k_1}(1-f_{k'})(1-f_{k_1'}) - f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})\big] \times \Big[\ln\Big(\frac{1-f_k}{f_k}\cdot\frac{1-f_{k_1}}{f_{k_1}}\Big) - \ln\Big(\frac{1-f_{k'}}{f_{k'}}\cdot\frac{1-f_{k_1'}}{f_{k_1'}}\Big)\Big]$$

$$\implies \frac{\partial \rho_\phi}{\partial t} = +\frac{k_b}{4}\sum_{k, k_1, k', k_1'} S(k, k_1 \to k', k_1') \times \big[\underbrace{f_k f_{k_1}(1-f_{k'})(1-f_{k_1'})}_{x_2} - \underbrace{f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})}_{x_1}\big] \times \big[\underbrace{\ln[f_k f_{k_1}(1-f_{k'})(1-f_{k_1'})]}_{\ln(x_2)} - \underbrace{\ln[f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})]}_{\ln(x_1)}\big]$$

$$\implies \frac{\partial \rho_\phi}{\partial t} = +\frac{k_b}{4}\sum_{k's} S(\ldots) \times (x_2 - x_1) \times \ln\Big(\frac{x_2}{x_1}\Big). \tag{21.23}$$
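The key identification $\bar{\phi}_k = k_b\ln(\frac{1-f_k}{f_k})$ in Equation 21.22 is simply the statement that $\bar{\phi}_k = \phi_k + f_k\frac{\partial\phi_k}{\partial f_k} = \frac{\partial(f_k\phi_k)}{\partial f_k} = \frac{\partial \mathcal{S}_k}{\partial f_k}$, the derivative of the single-state entropy. A small numerical check of this (an added sketch, in units $k_b = 1$):

```python
import math

kb = 1.0  # work in units where the Boltzmann constant is 1

def S_state(f):
    """Entropy of one fermionic state: -kb [f ln f + (1-f) ln(1-f)]."""
    return -kb * (f * math.log(f) + (1 - f) * math.log(1 - f))

def phibar(f):
    """Eq. 21.22: phibar = kb ln((1-f)/f), claimed to equal dS/df."""
    return kb * math.log((1 - f) / f)

h = 1e-6
errs = [abs((S_state(f + h) - S_state(f - h)) / (2 * h) - phibar(f))
        for f in (0.1, 0.4, 0.9)]
print(max(errs) < 1e-7)  # True: the central-difference dS/df matches Eq. 21.22
```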


[Footnote 20: The script $\mathcal{S}$ is used to distinguish entropy from the scattering rate $S$.]


[Fig. 21.8: Visual proof that $(x_2 - x_1)\times\ln(\frac{x_2}{x_1}) \geq 0$, and as a result, the entropy $\rho_\phi = \mathcal{S}(t)$ increases with time. This is analogous to the second law of thermodynamics, applied to the electron system.]

[Footnote 21: Boltzmann called $H = f\ln(f)$, effectively the negative of what we call entropy today, and showed that $H$ decreases with time. This is why it is referred to as the H-theorem. There are no $1-f$ factors; Boltzmann's $f$ was the classical occupation function of molecules of a gas in the $(r,p)$ phase space. Using the conservation arguments discussed in the next section, he derived the Maxwell-Boltzmann distribution of gas molecules at equilibrium. Historically, several attempts have been made to find violations of the Boltzmann H-theorem, famously resulting in the formulation of quantum mechanics by Max Planck, and several advances in mathematics resulting in the proof of the Poincaré conjecture. See Exercise 21.6 for further exploration.]

Since $k_b$ and the scattering rate $S(\ldots)$ are positive, whether the macroscopic entropy of the electron system increases or decreases with time hinges on the sign of the factor $(x_2 - x_1)\times\ln(\frac{x_2}{x_1})$ in Equation 21.23. Because the natural log is a monotonic function, for any choice of the pair $(x_2, x_1)$ we have the identity $(x_2 - x_1)\times\ln(\frac{x_2}{x_1}) \geq 0$, as seen in Fig. 21.8. This is why the entropy of the electron gas left to itself with no external fields will either increase or stay constant with time due to collisions. This is Boltzmann's H-theorem for the rate of change of entropy due to collisions. A typical situation is shown in Fig. 21.8: when an external force (say due to an electric field) is turned off, the entropy of the electron system spontaneously increases from the steady-state value $\mathcal{S}_{ss}$ to its equilibrium value $\mathcal{S}_{eq}$, in a manner such that the slope is positive. In the next section, we use the fact that equilibrium is defined when the rate of entropy production ceases: this happens when $x_2 = x_1$ in Equation 21.23. This fact, together with the fundamental conservation laws of physics, will allow us to derive the equilibrium $f_k$, which will turn out to be the Fermi-Dirac distribution $f^0 = 1/(\exp[(E-\mu)/k_bT]+1)$ for electrons. Fig. 21.8 also shows a zoomed-in plot of the entropy function, showing that there may be regions where the local $d\mathcal{S}(t)/dt < 0$. This is because correlations can violate the approximation of independent particles of the Liouville equation that led to the BTE. The entropy on the coarser scale must increase, maximize, and stop changing at equilibrium. The startling conclusion about the unidirectional, irreversible path a physical system spontaneously takes towards equilibrium, concluded here for electrons (which follow quantum statistics and quantum mechanics), remains unchanged for the molecules of a gas of classical particles subject to Newton's laws of motion and classical collisions.
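The H-theorem can be watched in action with a toy collision model (an added sketch; the number of states, rate values, and initial occupations are arbitrary choices). For elastic scattering with symmetric rates, the Pauli factors cancel and $\dot{f}_k = \sum_{k'} S_{kk'}(f_{k'} - f_k)$; Euler-stepping this from a far-from-equilibrium start, the total entropy $-\sum_k[f_k\ln f_k + (1-f_k)\ln(1-f_k)]$ never decreases:

```python
import math
import random

random.seed(2)
n, dt, steps = 6, 0.01, 1000
# Symmetric elastic rates S(k->k') = S(k'->k) among n states of equal energy
S = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i):
        S[i][j] = S[j][i] = random.uniform(0.5, 1.5)

f = [random.uniform(0.05, 0.95) for _ in range(n)]  # far-from-equilibrium start

def entropy(occ):
    """Total entropy -sum[f ln f + (1-f) ln(1-f)], in units of kb."""
    return -sum(x * math.log(x) + (1 - x) * math.log(1 - x) for x in occ)

ent = [entropy(f)]
for _ in range(steps):
    # Elastic collision integral: Pauli factors cancel, df_k = sum_k' S (f_k' - f_k)
    df = [sum(S[k][kp] * (f[kp] - f[k]) for kp in range(n)) for k in range(n)]
    f = [fk + dt * dfk for fk, dfk in zip(f, df)]
    ent.append(entropy(f))

monotone = all(e2 >= e1 - 1e-12 for e1, e2 in zip(ent, ent[1:]))
spread = max(f) - min(f)
print(monotone, spread < 1e-9)  # True True: entropy only grows; f becomes uniform
```

Since all the toy states share one energy, the maximum-entropy end point is a uniform occupation, the degenerate special case of the Fermi-Dirac result derived in the next section.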
In fact, this is Boltzmann's original "derivation" of the second law of thermodynamics, which explains why, for example, a gas that spontaneously expands from a small to a large volume does not spontaneously return to the small volume. In a similar way, electrons or holes that spontaneously diffuse out from a region of a semiconductor will not spontaneously return to that volume: this process is irreversible. Associated with every spontaneous process, therefore, is an arrow of time that always points towards increasing entropy, dictating that spontaneous physical processes are irreversible. A "video" of the process can unambiguously identify if it is allowed by the physical laws of nature, and distinguish the natural from the unnatural. What may seem rather mysterious is: where exactly did the irreversibility creep in? The microscopic laws governing the dynamics of systems are Schrödinger's equation in quantum mechanics and Newton's laws in classical mechanics. Both are completely reversible, because replacing $t \to -t$ does not change the equations and solutions. Scattering processes are also completely reversible for the whole system. Boltzmann's initial explanation was that after each scattering event, the particles are free to choose any one of the many final states that satisfy the conservation laws (of momentum, energy, angular momentum, etc.). He coined the term molecular chaos (stosszahlansatz) for this phenomenon. Though this interpretation has been hotly contested and argued to this day, Boltzmann's overall conclusion, that equilibrium is the state of highest entropy and that any system left to itself will irreversibly end up in the equilibrium state via collisions and scattering, is not just unscathed; it forms the foundation of thermodynamics, statistical mechanics, and quantum mechanics. See Exercise 21.3 for further thoughts on this matter. Let us turn our attention to the summit of entropy: the equilibrium state.

21.5 Equilibrium distribution

The solution of the BTE gives the occupation probability $f_k$ of each $k$ state for particles that have any general bandstructure of the form $E(k)$, in both equilibrium and non-equilibrium conditions. In Chapter 4 we found, by maximizing the entropy and enforcing quantum statistics, that the equilibrium distributions of fermions and bosons are given by the Fermi-Dirac and Bose-Einstein functions, respectively. It should be possible to derive these functions directly from the BTE too. We show now that this indeed is the case, and that it is a consequence of the fundamental conservation laws. Let us consider the equilibrium case when Equation 21.20 holds. Unlike the last section, where we chose the specific $\phi_k$ for the entropy density, we now choose a $\phi_k$ that is not a function of $f_k$, but of $k$ alone, implying $\bar{\phi}_k = \phi_k + f_k\frac{\partial \phi_k}{\partial f_k} = \phi_k$ from Equation 21.22, and therefore

$$\frac{\partial \rho_\phi}{\partial t} = -\frac{1}{4}\sum_{k, k_1, k', k_1'} S(k, k_1 \to k', k_1') \times (\phi_k + \phi_{k_1} - \phi_{k'} - \phi_{k_1'}) \times \big[f_k f_{k_1}(1-f_{k'})(1-f_{k_1'}) - f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})\big]. \tag{21.24}$$

The LHS of this equation is zero for a macroscopic physical quantity $\rho_\phi$ that is conserved in collisions, because then it does not change with time. What are such quantities? The number of electrons is conserved in a scattering event: for example, in electron-electron scattering there are two electrons before and two after the scattering event. This is captured by $\phi_k = \phi_{k_1} = \phi_{k'} = \phi_{k_1'} = 1$, and $\phi_k + \phi_{k_1} = \phi_{k'} + \phi_{k_1'}$. Similarly, choosing $\phi_k = E_k$, $\phi_{k_1} = E_{k_1}$, $\phi_{k'} = E_{k'}$, and $\phi_{k_1'} = E_{k_1'}$ again requires $\phi_k + \phi_{k_1} = \phi_{k'} + \phi_{k_1'}$, because $E_k + E_{k_1} = E_{k'} + E_{k_1'}$ is the statement of energy conservation in the collision. Corresponding statements hold for all other physical properties (such as momentum and angular momentum) that are conserved in the collision process. Note that the above quantities are conserved in collisions even when the system is not in equilibrium. Therefore, physical quantities $\phi_k$ that are conserved in collisions and make $\frac{\partial \rho_\phi}{\partial t} = 0$ exhibit the additive feature $\sum_{before}\phi_k = \sum_{after}\phi_{k'}$.


[Footnote 22: Richard Tolman, in his book The Principles of Statistical Mechanics, has this to say about Boltzmann's H-theorem: "The derivation of this theorem and the appreciation of its significance may be regarded as one of the greatest achievements of physical science".]

[Footnote 23: We tackle the case of fermions here; you do the same for bosons in Exercise 21.8.]


In the last section we found that equilibrium is characterized by zero entropy production, $\frac{\partial \mathcal{S}}{\partial t} = 0$, which strictly requires

$$f_k f_{k_1}(1-f_{k'})(1-f_{k_1'}) = f_{k'} f_{k_1'}(1-f_k)(1-f_{k_1})$$

$$\implies \frac{f_k}{1-f_k}\cdot\frac{f_{k_1}}{1-f_{k_1}} = \frac{f_{k'}}{1-f_{k'}}\cdot\frac{f_{k_1'}}{1-f_{k_1'}}$$

$$\implies \ln\Big(\frac{f_k}{1-f_k}\Big) + \ln\Big(\frac{f_{k_1}}{1-f_{k_1}}\Big) = \ln\Big(\frac{f_{k'}}{1-f_{k'}}\Big) + \ln\Big(\frac{f_{k_1'}}{1-f_{k_1'}}\Big). \tag{21.25}$$

Evidently, the dimensionless functional form $\ln(\frac{f_k}{1-f_k})$ has the additive form of quantities that are conserved in collisions, $\sum_{before}\phi_k = \sum_{after}\phi_{k'}$. If we set $\ln(\frac{f_k}{1-f_k}) = c\cdot E_k$, where $c$ is a constant of inverse energy dimensions, then we obtain $f_k = 1/[\exp(c\cdot E_k) + 1]$. Though this form looks similar to the Fermi-Dirac distribution, it cannot be quite correct, because energy is not the only conserved quantity in the collision. In electron-electron collisions, the conserved quantities are particle number ($\phi_k = 1$), momentum ($\phi_k = \mathbf{p}$), and energy ($\phi_k = E_k$). Then the quantity that must be simultaneously conserved is the dimensionless linear combination $a\cdot 1 + \mathbf{b}\cdot\mathbf{p} + c\cdot E$, where $a, c$ are scalar constants and $\mathbf{b}$ is a vector constant. The correct form of the equilibrium distribution function is then

$$\ln\Big(\frac{f_k}{1-f_k}\Big) = a + \mathbf{b}\cdot\mathbf{p} + c\cdot E \implies f_k = \frac{1}{\exp\big[\frac{E_k - \mathbf{u}\cdot\mathbf{p}_k - \mu}{k_b T}\big] + 1}. \tag{21.26}$$

[Footnote 24: Boltzmann applied this technique to obtain the distribution of speeds of gas molecules of mass $M$ as $f(v) = \big(\frac{M}{2\pi k_b T}\big)^{3/2} 4\pi v^2\, e^{-\frac{Mv^2}{2k_b T}}$ from Newton's laws. See Exercise 21.7.]
M f (v) = ( 2πk ) 2 4πv2 e 2kb T from NewbT ton’s laws. See Exercise 21.7.

From requirements of entropy and particle number, the constants are identified to be $a = \mu/k_bT$, where $\mu$ is the chemical potential as a measure of the particle number, $c = -1/k_bT$, and $\mathbf{b} = \mathbf{u}/k_bT$, where $\mathbf{u}$ is the drift velocity of the particles. Now, in a semiconductor or a metal, an electron will collide every now and then with a stationary heavy donor atom or defect in an elastic manner, such that the electron energy is conserved but its momentum is not. Thus we can remove the momentum conservation requirement if we want the distribution function of the electrons alone, and recover $f_k^0 = 1/(\exp[\frac{E_k - \mu}{k_bT}] + 1)$, the form of the Fermi-Dirac distribution that is correct for this situation. We thus arrive at a very important result: the solution of the BTE for electrons at thermal equilibrium is the Fermi-Dirac distribution function, which we label $f_k^0$. This is a remarkable statement, because it is independent of the fact that the electrons may be colliding with various kinds of defects, scattering off of lattice vibrations of several kinds, or even being thermally excited from one band to another. Though several scattering events are responsible for taking electrons to their equilibrium distribution, the equilibrium distribution itself is independent of the nature or type of the scattering events. What the nature of the scattering events does control are the dynamical and steady-state properties of the electron population $f_k$ when it is taken away from, or brought back towards, equilibrium. For example, the scattering rates


determine exactly how long it takes for f_k to reach the equilibrium distribution f_k^0 when an external perturbation is removed. Scattering processes also determine how long it takes for the electron system to go from the equilibrium f_k^0 to a steady state f_k^ss when a perturbation is turned on, and the precise steady-state value f_k^ss it can reach. These deviations from equilibrium are discussed in the next section. In the rest of this section, we discuss a few more properties of the equilibrium distribution and microscopic processes that occur in equilibrium25.

Principle of detailed balance: Equilibrium is defined by the condition that the RHS of the BTE is zero, i.e., Ĉf_k = 0. For general inelastic scattering given by Equation 21.15, this requires the equilibrium distribution f_k = f_k^0 to satisfy

Equilibrium distribution 495

25 An extended preview of the BTE was provided in Chapter 16, Section 16.6. I recommend reviewing that section before proceeding in this chapter.

∑_{k'} [S(k'→k) f_{k'}^0 (1 − f_k^0) − S(k→k') f_k^0 (1 − f_{k'}^0)] = 0

⟹ S(k→k') f_k^0 (1 − f_{k'}^0) = S(k'→k) f_{k'}^0 (1 − f_k^0)

⟹ S(k→k') / S(k'→k) = [f_{k'}^0 (1 − f_k^0)] / [f_k^0 (1 − f_{k'}^0)] = exp[(E_k − E_{k'})/k_bT].  (21.27)

26 Detailed balance results from statistical mechanics. It should not be confused with microscopic reversibility. The principle of microscopic reversibility is a purely quantum mechanical result with no connection to statistical mechanics. It states that when all particles are considered, the scattering rate in one direction is identical to its reverse, because of the time-reversibility of the Schrödinger equation. For example, for elastic scattering S(k→k') = S(k'→k) is microscopic reversibility, and not detailed balance.

27 The boxed expression would have been obtained if the Pauli exclusion principle was not enforced in the scattering. Can you explain why it is so, since the Pauli exclusion principle cannot be violated for electrons?

In the above equation every single term inside the sum must be identically zero. The rate of every scattering event is exactly counterbalanced by its reverse process. This is the principle of detailed balance26. If it were not so, and say S(k→k') f_k^0 (1 − f_{k'}^0) > S(k'→k) f_{k'}^0 (1 − f_k^0), the occupation f_k^0 would decrease with time as particles scatter out of the state faster than they scatter into it, and f_{k'}^0 would correspondingly increase, making f_k ≠ f_k^0 and violating equilibrium.

Elastic scattering rates: The scattering rate S(k'→k) and its reverse S(k→k') given by Fermi's golden rule are independent of the occupation functions f_k and f_{k'} of the electrons, but depend on the nature of the scattering mechanisms. From the boxed Equation 21.27 it is seen that if E_k = E_{k'} and the scattering is elastic, S(k→k') = S(k'→k). The collision term of the BTE for elastic scattering then becomes27

∂f_k/∂t |_coll.^elastic = ∑_{k'} [S(k'→k) f_{k'} (1 − f_k) − S(k→k') f_k (1 − f_{k'})]

⟹ ∂f_k/∂t |_coll.^elastic = ∑_{k'} S(k'→k) [f_{k'} − f_k].  (21.28)

Scattering of electrons by charged ions and various static defects is an elastic process, as will be discussed in Chapter 23. This simple form of the collision term will allow a solution of the BTE to evaluate the effect of such scattering on electron transport via the definition of a relaxation time, discussed in the next section. Note that in the above boxed equation the coefficient of f_k on the RHS is a relaxation time of the

496 No Turning Back: The Boltzmann Transport Equation

Fig. 21.9 Detailed balance in electron–phonon scattering events.

form 1/τ(k) = ∑_{k'} S(k'→k), making the RHS linear in the unknown f_k.
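A small numeric check of the detailed balance condition of Equation 21.27: when the forward and reverse rates are related by the Boltzmann factor, the forward and reverse fluxes computed with Fermi–Dirac occupations balance exactly. The energies, chemical potential, and base rate below are illustrative numbers, not from the text:

```python
# Numeric check of Eq. 21.27 (detailed balance): with Fermi-Dirac occupations,
# the out-flux k -> k' equals the in-flux k' -> k exactly when
# S(k->k')/S(k'->k) = exp[(Ek - Ek')/(kb*T)].
import math

kbT = 0.0259           # kb*T at room temperature, eV
mu = 0.1               # chemical potential, eV (illustrative)

def f0(E):             # Fermi-Dirac distribution
    return 1.0 / (math.exp((E - mu) / kbT) + 1.0)

Ek, Ekp = 0.05, 0.20   # energies of states k and k', eV (inelastic: Ek != Ek')
S_fwd = 1e12                                # S(k->k'), 1/s, illustrative
S_rev = S_fwd * math.exp((Ekp - Ek) / kbT)  # S(k'->k), fixed by Eq. 21.27

out_flux = S_fwd * f0(Ek) * (1.0 - f0(Ekp))   # rate of k -> k' transitions
in_flux = S_rev * f0(Ekp) * (1.0 - f0(Ek))    # rate of k' -> k transitions
print(out_flux, in_flux)   # equal: every microscopic process is balanced
```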

Inelastic scattering rates: Using the boxed Equation 21.27, the collision term of the BTE for inelastic scattering events is

28 For example, if |E_k − E_{k'}| = ħω ≫ k_bT, the scattering term approximates to Ĉf_k ≈ ∑_{k'} S(k→k') e^{ħω/k_bT} (f_{k'} − f_{k'} f_k), to which a relaxation time approximation may be crudely applied. This physical condition occurs for polar optical phonon scattering in wide-bandgap semiconductors, where ħω ≫ k_bT holds at room temperature. Note that the elastic case is a special case of inelastic scattering when E(k) = E(k').

29 For phonons in equilibrium at a temperature T, the distribution function is n_ω^0 = 1/(exp[ħω_q/k_bT] − 1), the Bose–Einstein distribution.

30 Note that the ħω argument for a phonon could be replaced by that of a photon, or of other bosonic quanta with a corresponding dispersion. In other words, the concepts of absorption, spontaneous emission, and stimulated emission appear whenever fermionic (e.g. electron) systems interact with bosonic (e.g. photon, phonon) systems. Such interactions play a critical role in electron transport in crystals as well as in electron–photon or light–matter interactions.

∂f_k/∂t |_coll.^inelastic = ∑_{k'} S(k→k') [e^{(E_{k'} − E_k)/k_bT} f_{k'} (1 − f_k) − f_k (1 − f_{k'})].  (21.29)

The collision term does not simplify further for inelastic scattering. Because the RHS is in general a non-linear function of f_k, a relaxation time cannot be strictly defined, except for special cases28. Fig. 21.9 depicts an inelastic scattering process in which the electron transitions from state k → k', of energies E(k) and E(k') that differ by ħω. The energy E(k) − E(k') = ħω and momentum ħk − ħk' = ħq lost by the electron must be gained by the object with which the electron collides, since the total energy and momentum of the (electron + scatterer) system is conserved. Consider the case when the electron scatters from phonons. The electron bandstructure E(k) in Fig. 21.9 dictates the possible ħω and ħq for the electron. The corresponding energy dispersion ħω_q for phonons depends on the crystal structure, the masses of the atoms, and the chemical bond strengths, when pictured as a periodic mass-spring system. The resulting phonon energy dispersion ħω_q dictates which energies of phonons are available at particular wavevectors q. Only those electron scattering processes that match the phonon energy and momentum relation are allowed; this is treated in detail in Chapter 22. For the scattering process shown, the number of phonons29 in the mode, n_ω, increases by one, to n_ω + 1. The scattering rate for phonon emission by electrons is proportional to n_ω + 1, whereas the scattering rate for phonon absorption is proportional to n_ω, hence satisfying Equation 21.27. The factor of 1 in emission represents the spontaneous part of the emission, and the proportionality to n_ω the stimulated part. Absorption of phonons is always stimulated. Note that the RHS of the BTE in Equations 21.28 and 21.29 are manifestly non-equilibrium terms, but they have several layers of the physics of equilibrium baked into them, which will facilitate their solution30.
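The emission and absorption proportionalities n_ω + 1 and n_ω can be checked against Equation 21.27 in a few lines: with the Bose–Einstein occupation, the emission-to-absorption ratio is exactly exp(ħω/k_bT). The phonon energy below is an illustrative value (roughly a GaN-like LO phonon), not from the text:

```python
# The emission and absorption rates proportional to (n+1) and n are exactly
# what detailed balance (Eq. 21.27) requires: their ratio equals
# exp(h_bar*omega/(kb*T)) when n is the Bose-Einstein occupation.
import math

kbT = 0.0259            # kb*T at 300 K, eV
hw = 0.036              # phonon energy h_bar*omega, eV (illustrative)

n = 1.0 / (math.exp(hw / kbT) - 1.0)   # Bose-Einstein occupation n_omega
ratio = (n + 1.0) / n                  # (emission rate)/(absorption rate)
print(n, ratio, math.exp(hw / kbT))    # ratio matches the Boltzmann factor
```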



Non-uniform equilibrium: We can coax out more from the BTE about the nature of the equilibrium distribution function. Now we show that the most general equilibrium distribution takes the form

f_k^0(r) = 1/(exp[(E_c(r) + E(k) − E_F)/k_bT] + 1), where F = −∇_r E_c(r).  (21.30)

At equilibrium, we will strictly have ∇_r E_F = 0 and ∇_r T = 0. This means the Fermi level and the temperature must be constant in space, but the conduction band edge E_c(r) can vary in space. At equilibrium F = −∇_r E_c(r) is the internal force on an electron: it makes the electron roll down a gradient in the conduction band edge E_c(r). An example of the non-uniform equilibrium condition encountered in semiconductor quantized structures is shown in Fig. 21.10. To prove Equation 21.30, let us assume the more general form in which E_c, E_F, and T are allowed to vary in space and time:

f_k(r, t) = 1/(e^{η(r,k,t)} + 1), where η(r, k, t) = [E_c(r, t) + E(k) − E_F(r, t)] / [k_bT(r, t)].  (21.31)

Substituting this form into Equations 21.28 and 21.29 shows that it satisfies the requirement for equilibrium that the RHS of the BTE Ĉ{f_k(r, t)} = 0. But this is not enough: for f_k(r, t) to be an equilibrium distribution, it must also make the LHS of the BTE = 0:

∂f_k(r, t)/∂t + v·∇_r f_k(r, t) + (F/ħ)·∇_k f_k(r, t) = 0,

where the three terms are the chain-rule derivatives (∂f_k(r, t)/∂η)(∂η/∂t), (∂f_k(r, t)/∂η)(v·∇_r η), and (∂f_k(r, t)/∂η)((F/ħ)·∇_k η) respectively, so that

(∂f_k(r, t)/∂η) [∂η/∂t + v·∇_r η + (F/ħ)·∇_k η] = 0.  (21.32)

Fig. 21.10 Nonuniform equilibrium in a semiconductor graded-index separate confinement heterostructure (GRINSCH) quantum well laser active region.

Since ∂f_k(r, t)/∂η ≠ 0, the term in the square bracket must be zero. Noting that ∇_k η(r, k, t) = (∇_k E(k))/k_bT(r, t) = (ħv)/k_bT(r, t), and using the condition for equilibrium that all time dependences must vanish, implies v·[F + k_bT(r) ∇_r η(r, k)] = 0, which, when written as

[F + ∇_r E_c(r)] − ∇_r E_F(r) + T(r) [E_c(r) + E(k) − E_F(r)] ∇_r(1/T(r)) = 0,  (21.33)

with each of the three terms individually required to vanish,


gives us the constraints that F = −∇r Ec (r), and EF and T cannot vary in space. Thus the general non-uniform equilibrium distribution form is indeed given by Equation 21.30. In Equation 21.33 the temperature cannot vary with r because E(k) allows the choice of any k, implying ∇r (1/T (r)) = 0 must be met, and the rest of the terms must also individually be zero. Note that internal forces are allowed in equilibrium via F = −∇r Ec (r), with corresponding electric fields. The BTE


is quantitatively enforcing what we knew all along: for example, these are the internal electric fields in Schottky or pn junctions, or in heterostructures, as again indicated in Fig. 21.10. For electrons in the conduction band of 3D bulk silicon, for example, the equilibrium distribution function is

f_k^0(r) = 1/(exp[(E_c(r) + (ħ²/2)(k_x²/m_xx + k_y²/m_yy + k_z²/m_zz) − E_F)/k_bT] + 1),  (21.34)

which explicitly shows the dependence on space via r and on the bandstructure via k = (k_x, k_y, k_z). If we consider a silicon pn junction in equilibrium, this expression holds everywhere: in the n-side, the p-side, as well as inside the depletion region. Remembering that the BTE is defined separately for each band, the force F and the bandstructure of the valence band are defined along similar lines.

31 The actual wavefunction of the wavepacket is the product of the envelope and the time-independent periodic part of the Bloch function u_k(r); this factor cancels out. Indeed, this is the reason why the effective mass approximation works in the first place. The spatial variation of the conduction band edge E_c(r) is retained here in the WKB sense, discussed in Chapter 16, Section 16.4.

Non-uniform carrier distributions: The general non-uniform equilibrium electron concentration at any point r, in a form suitable for non-uniform potentials and especially for quantized heterostructures, is n(r) = ∑_k |C_k(r)|² f_k^0(r), where C_k(r) is the k-dependent envelope function of the wavepacket centered at k. This result comes about from a mix of quantum mechanics and statistical mechanics in the following way. In the absence of scattering, the solution of the time-independent effective mass Schrödinger equation discussed in Chapters 14 and 16 is the time-independent envelope function31 C_k(r). With a scattering potential W(r, t) the solution is a linear combination Ψ(r, t) = ∑_k ψ_k(t) C_k(r), which must satisfy

iħ ∂Ψ(r, t)/∂t = [E_c(r) + E(−i∇) + W(r, t)] Ψ(r, t)

iħ ∑_k C_k(r) ∂ψ_k(t)/∂t = ∑_k ψ_k(t) [E_c(r) + E(k) + W(r, t)] C_k(r),  (21.35)

where multiplying by C_{k'}*(r), integrating over all space, and using the orthogonality ∫dr C_{k'}*(r) C_k(r) = δ_{k,k'} gives

iħ ∂ψ_k(t)/∂t = ψ_k(t) [E_c(r) + E(k)] + ∑_{k'} ψ_{k'}(t) ⟨C_{k'}(r)|W(r, t)|C_k(r)⟩

⟹ ∂ψ_k(t)/∂t + (i/ħ) [E_c(r) + E(k)] ψ_k(t) = ∑_{k'} ψ_{k'}(t) W_{kk'}/(iħ).  (21.36)

Here W_{kk'} = ⟨C_{k'}(r)|W(r, t)|C_k(r)⟩ = ∫dr C_{k'}*(r) W(r, t) C_k(r) is the same scattering matrix element that appears in Fermi's golden rule. Equation 21.36 shows how the time-dependent coefficients evolve in time due to scattering. In the absence of scattering, W(r, t) = 0 makes


the RHS zero, and the amplitudes evolve in time in typical stationary-state fashion, ψ_k(t) ∼ exp[−i(E_c(r) + E(k))t/ħ]. Statistical mechanics guarantees32 that with scattering by any W(r, t), the solution of Equation 21.36 satisfies ⟨ψ_{k'}*(t) ψ_k(t)⟩ = f_k^0(r) δ_{k,k'}, with f_k^0(r) given by Equation 21.30. Then, the electron distribution at equilibrium is given by

n(r, t) = Ψ*(r, t) Ψ(r, t) = ∑_{k,k'} ψ_{k'}*(t) ψ_k(t) C_{k'}*(r) C_k(r), with ⟨ψ_{k'}*(t) ψ_k(t)⟩ = f_k^0(r) δ_{k,k'}

⟹ n_0(r) = ∑_k |C_k(r)|² f_k^0(r) = ∑_k |C_k(r)|² / (exp[(E_c(r) + E(k) − E_F)/k_bT] + 1).  (21.37)

32 Though we do not discuss the rigorous proof, it is justified in the following heuristic way. The Schrödinger equation with scattering may be solved in principle (if not in practice) at any instant t with the scattering potential W(r, t) to reveal all eigenfunctions at that instant, which by the expansion theorem are the linear combinations ∑_k ψ_k(t) C_k(r). From statistical mechanics we know the average equilibrium occupation probability of state k is f_k^0(r), which must be related to the quantum mechanical amplitude via |ψ_k(t)|². But because of scattering by the random potential W(r, t), electrons transition between states k ↔ k', their amplitudes are mixed, causing fluctuations, resulting in ⟨ψ_k*(t) ψ_{k'}(t)⟩ = f_k^0(r) δ_{k,k'}. W(r) that are independent of time cause elastic scattering that does not randomize the phase of ψ_k(t); only the amplitudes fluctuate. But time-dependent W(r, t) cause inelastic scattering, and the phases also fluctuate. Fluctuations are measurable as macroscopic noise in electrical currents and voltages, and reveal information about the microscopic W(r, t).

As an example of its application, consider an infinitely deep quantum well in the z-direction shown in Fig. 21.11, where the electron is free to move in the x–y plane. The electron envelope function from Chapter 14 is C_{n_z}(x, y, z) = [√(2/L_z) sin(πn_z z/L_z)] · e^{i(k_x x + k_y y)}/√(L_x L_y), with eigenvalues E_{n_z} + ħ²(k_x² + k_y²)/(2m_c*). If the Fermi level is at E_F, the electron distribution is given by

n_0(x, y, z) = g_s g_v ∑_{k_x,k_y,n_z} |C_{n_z}(x, y, z)|² / (exp[(E_{n_z} + ħ²(k_x² + k_y²)/(2m_c*) − E_F)/k_bT] + 1)

= (2 g_s g_v/L_z) ∑_{n_z} sin²(πn_z z/L_z) · (1/(2π)²) ∫_{k=0}^{∞} d²k / (exp[(ħ²k²/(2m_c*) − (E_F − E_{n_z}))/k_bT] + 1)

⟹ n_0(z) = (g_s g_v m_c* k_bT/(πħ² L_z)) ∑_{n_z} sin²(πn_z z/L_z) · ln[1 + e^{(E_F − E_{n_z})/k_bT}].  (21.38)
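Equation 21.38 is easy to evaluate numerically. The sketch below checks the closed-form logarithm against a brute-force 2D k-space integration and evaluates the density profile n_0(z); the well width, effective mass, and Fermi level are illustrative assumptions, not values from the text:

```python
# Sketch: equilibrium electron profile n0(z) of Eq. 21.38 for an infinitely
# deep quantum well, plus a numerical check of the per-subband 2D integral.
import math

hbar = 1.054571817e-34   # reduced Planck constant, J s
me = 9.1093837015e-31    # electron mass, kg
q = 1.602176634e-19      # elementary charge, C
kb = 1.380649e-23        # Boltzmann constant, J/K

T = 300.0
mc = 0.2 * me            # effective mass m_c*, illustrative
Lz = 10e-9               # well width, m, illustrative
EF = 0.10 * q            # Fermi level above the well bottom, J, illustrative
gs, gv = 2, 1            # spin and valley degeneracies

def En(nz):              # particle-in-a-box subband energies E_nz
    return (hbar * math.pi * nz / Lz) ** 2 / (2.0 * mc)

def n2d_exact(Enz):      # closed-form (1/(2*pi)^2) ∫ d^2k f, the log of Eq. 21.38
    return (mc * kb * T / (2.0 * math.pi * hbar**2)) * math.log1p(math.exp((EF - Enz) / (kb * T)))

def n2d_numeric(Enz, kmax=2e9, n=40000):   # brute-force k-integration
    dk = kmax / n
    s = 0.0
    for i in range(1, n):
        k = i * dk
        E = Enz + (hbar * k) ** 2 / (2.0 * mc)
        s += k * dk / (math.exp((E - EF) / (kb * T)) + 1.0)
    return s / (2.0 * math.pi)

def n0_z(z, nmax=8):     # Eq. 21.38: density profile inside the well, m^-3
    pref = gs * gv * mc * kb * T / (math.pi * hbar**2 * Lz)
    return pref * sum(math.sin(math.pi * nz * z / Lz) ** 2
                      * math.log1p(math.exp((EF - En(nz)) / (kb * T)))
                      for nz in range(1, nmax + 1))

rel_err = abs(n2d_numeric(En(1)) / n2d_exact(En(1)) - 1.0)
print(rel_err)               # the analytic logarithm matches the k-integral
print(n0_z(Lz / 2))          # density at the well center; vanishes at the walls
```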

21.6 The RTA: time to relax!

We resume our journey towards the promised land of actually solving the BTE to obtain the f_k(r, t) in response to external Lorentz forces due to electric and magnetic fields F = −q(E + v × B), or temperature gradients ∇_r[1/T(r)]. The BTE with such forces is

∂f_k/∂t + v·∇_r f_k + (F/ħ)·∇_k f_k = Ĉ{f_k},  (21.39)

where the LHS force term is labeled L̂{f_k}. For general inelastic scattering, from Equation 21.29, Ĉ{f_k} = ∑_{k'} S(k→k') [exp[(E_{k'} − E_k)/k_bT] f_{k'} (1 − f_k) − f_k (1 − f_{k'})] makes the RHS non-linear in f_k. The sum over k' converts to an integral, making the BTE in its full glory a non-linear integro-differential equation, and in several dimensions to boot, which is difficult to solve. But a remarkable simplification is possible by appealing to the physics for an approximation.

Fig. 21.11 Equilibrium electron distribution in an infinitely deep quantum well. Highly non-uniform electron distributions in quantized structures are obtained by using the effective mass approximation with the generalized equilibrium Fermi–Dirac distribution function using Equation 21.37. Note that the distribution n0 (r) here is only the envelope portion; the full wavefunction will also have a Bloch-periodic part with the periodicity of the lattice, of the form n0 (r)|u(r)|2 . In quantum well HEMTs and lasers, the EF is tuned by field effect or carrier injection, thereby controlling the operation of such devices.


Fig. 21.12 Schematic depiction of the balance between the LHS force term L̂{f} and the collision term Ĉ{f} of the BTE when the system is taken out of, and then returned to, equilibrium. The center figure shows the evolution of the distribution function f_k(t), the external force F(t), and the entropy S(t) with time. Fig. 21.13 shows the time evolution in k-space.

33 The only known exact solution of the BTE is for a specialized scattering potential for the classical ideal gas, obtained by James Clerk Maxwell around the time Boltzmann formulated the equation.

34 This is also justified through a series solution of the form f_k ≈ f_k^0 + f_k^{(1)} + f_k^{(2)} + ..., where f_k^{(1)} is the first order, f_k^{(2)} is the second order, and so on. If the second and higher orders are neglected, the first-order solution is indeed of the form f_k^{(1)} ≈ (f_k − f_k^0), justifying the relaxation-time approximation.

Consider Fig. 21.12, in which the time evolution of the electron distribution function f_k(t), the force F(t), and the corresponding entropy S(t) of the electron system are shown schematically33. At equilibrium, there are no external forces and the entropy is at its maximum. The LHS L̂_0 = 0, the RHS Ĉ{f_k^0} = 0, and they "balance", as indicated by the see-saw at the bottom. The instant an external force is turned on, L̂_0 → L̂_1 ≠ 0, and the balance is broken because the LHS is non-zero while the RHS is zero. The entropy decreases, and f_k(t) will now change to find a new steady-state value f_k^ss which restores the balance in the BTE, as shown on the top. If, after steady state is reached, the external force is removed, the electron distribution must relax back from f_k^ss → f_k^0. If f_k did not deviate too far from f_k^0, when returning to equilibrium it is fair to approximate the BTE by34

∂f_k/∂t = Ĉ{f_k} ≈ −(f_k − f_k^0)/τ  ⟹  f_k(t) ≈ f_k^0 + (f_k^ss − f_k^0) e^{−t/τ}.  (21.40)

If the BTE can be written in such a linear fashion, its solution in Equation 21.40 indicates that f_k(t) will relax from f_k^ss to f_k^0 as t → ∞, exponentially with a characteristic relaxation time τ. This is the relaxation time approximation (RTA) of the BTE. Its simplicity underpins its immense practical value. From the discussion of equilibrium, all microscopic scattering processes are captured by the time τ. The path out of equilibrium with this approximation is

∂f_k/∂t + v·∇_r f_k + (F/ħ)·∇_k f_k = Ĉ{f_k} ≈ −(f_k − f_k^0)/τ.  (21.41)
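A minimal numerical sketch of the relaxation dynamics: integrating df_k/dt = −(f_k − f_k^0)/τ by forward Euler reproduces the exponential solution of Equation 21.40. The occupations and τ below are illustrative numbers, not from the text:

```python
# Relaxation-time approximation, Eq. 21.40: integrate df/dt = -(f - f0)/tau
# numerically and compare with the analytic exponential relaxation.
import math

f0 = 0.30        # equilibrium occupation of some state k (illustrative)
fss = 0.35       # steady-state occupation while the force was on (illustrative)
tau = 1e-13      # relaxation time, s (typical phonon-scattering scale)

dt = tau / 1000.0
f = fss                          # force switched off at t = 0
for _ in range(3000):            # integrate out to t = 3*tau
    f += -(f - f0) / tau * dt

t = 3000 * dt
f_analytic = f0 + (fss - f0) * math.exp(-t / tau)
print(f, f_analytic)             # forward Euler tracks Eq. 21.40 closely
```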




The simplified solution was discussed in Chapter 16, Section 16.6. For a force F and a concentration gradient ∇_r f_k, when the distribution approaches steady state (∂f_k/∂t = 0), the solution to this equation is

f_k = f_k^ss ≈ f_k^0 + τ v_k · [(−∂f_k^0/∂E_k) F + (−∇_r f_k^0)],  (21.42)

where the first term in the bracket is the drift component and the second the diffusion component, the group velocity is v_k = ħ^{−1} ∇_k E_k, and the small-deviation approximation −∂f_k/∂E_k ≈ −∂f_k^0/∂E_k is used. The two components indicated underpin the very useful drift-diffusion method of analysis of charge transport in semiconductors. Fig. 21.13 indicates the time evolution of the distribution function, depicting possible f_k in a 2D system at the various instants of time A–E indicated in Fig. 21.12. The solution in Equation 21.42 gives f_k^ss, the steady-state distribution at time instant C. The shift of the occupation function depends roughly linearly on τ, leading to a higher conductivity for larger τ. This approximate solution of the BTE expresses the unknown distribution function f_k in terms of all known quantities and driving forces on the RHS. Several transport parameters can now be evaluated from it. Consider the flow of quantities such as charge, heat, spin, etc. that depend on the wavevector k through a known functional form φ_k, with a corresponding volume density ρ_φ = ∑_k f_k φ_k/L^d. The net φ-current J_φ and the corresponding φ-conservation rule from Equation 21.42 are

J_φ = (1/L^d) ∑_k φ_k v_k f_k  and  ∂ρ_φ/∂t + ∇_r·J_φ = G_φ − R_φ,  (21.43)


where Gφ , Rφ are generation and recombination rates and Ld is the volume in d-dimensions. We use the current and continuity expressions in the following sections to understand the effect of scattering on electrical conductivity and general transport coefficients of semiconductors. In applying the approximate solution of Equation 21.42 in evaluating various currents, we must exercise caution because as discussed in the last section, a relaxation time may be defined for elastic scattering but not for inelastic scattering. When defined, τ is in general dependent on k, as was seen in the preview of ionized impurity scattering in Chapter 16, Section 16.8. We will discuss its validity as we use it to connect to several experimental transport parameters.

21.7 One formula to rule them all

Fig. 21.14 shows how the entropy of the electron system changes when external forces are applied. Recall that the entropy of state k of the electron system with distribution function f_k is s_k = −k_b [f_k ln(f_k) + (1 − f_k) ln(1 − f_k)], and its total entropy is S = ∑_k s_k. The three

Fig. 21.13 Schematic diagram showing the distribution function f_k(t) of Fig. 21.12 in k-space at the times indicated by A, B, C, D, and E. A circular Fermi surface results from an isotropic effective mass. Note the elastic and inelastic scattering processes indicated. The Fermi circle does not roll away, but gets "stuck" at a distance roughly ∆k ∼ Fτ/ħ from the origin for a force F. When the force is removed, the distribution function returns to equilibrium. The non-equilibrium f_k shape does not need to remain circular in the general case, even for isotropic effective masses. A typical relaxation time of τ ∼ 10^{−13} s, or 0.1 ps, at room temperature for electron-phonon scattering in semiconductors, and a small electric field of 1 kV/cm, leads to |∆k| ∼ 10^7 m^{−1}, less than (1/100)th of the Brillouin-zone edge.
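The shift quoted in the caption follows directly from ∆k = qEτ/ħ, using the caption's illustrative field and relaxation time:

```python
# The Fermi-circle displacement delta_k ~ F*tau/hbar for E = 1 kV/cm
# and tau = 0.1 ps, the illustrative values quoted in the caption.
hbar = 1.054571817e-34   # reduced Planck constant, J s
q = 1.602176634e-19      # elementary charge, C

E = 1e5                  # 1 kV/cm expressed in V/m
tau = 1e-13              # relaxation time, s
dk = q * E * tau / hbar  # displacement of the distribution in k-space
print(dk)                # ~1.5e7 1/m, tiny compared to a zone edge ~1e10 1/m
```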

Fig. 21.14 The distribution function f_k in (a) and the entropy density s_k in (b), shown as a function of k = |k| in equilibrium (solid lines) and in steady state (dashed lines). The area under s_k is the total entropy of the electron system, shown in (c) as a function of the external perturbation g = g_k. The entropy of the electron system decreases compared to its equilibrium value when an external force is applied. Notice that the entropy density is zero in regions where f_k = 0 or f_k = 1, and peaks in the transition region. Also note that the area under the entropy density curve s_k^ss decreases compared to the equilibrium curve s_k^0.
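The entropy decrease sketched in Fig. 21.14(c) can be reproduced with a toy 1D parabolic band in dimensionless units (k_b = 1; all parameters illustrative): the drifted distribution f_k^0 − g_k(∂f_k^0/∂E_k) with g_k odd in k has lower total entropy than the equilibrium one:

```python
# Numerical illustration of Fig. 21.14(c): for f = f0 - g*(df0/dE) with
# g_k odd in k (a drifting distribution), the total entropy
# S = -kb * sum[f ln f + (1-f) ln(1-f)] drops below its equilibrium value.
# Dimensionless units, kb = 1, parabolic band E = k^2/2 (all illustrative).
import math

mu, T = 1.0, 0.1

def f0(E):
    return 1.0 / (math.exp((E - mu) / T) + 1.0)

def entropy(fs):
    s = 0.0
    for f in fs:
        # linearization can leave tiny tail values outside (0,1); those
        # occupations are negligible and are skipped by the guard
        if 0.0 < f < 1.0:
            s -= f * math.log(f) + (1.0 - f) * math.log(1.0 - f)
    return s

ks = [i * 0.01 for i in range(-400, 401)]   # symmetric k-grid
g0 = 0.05                                   # perturbation strength ~ tau*F
feq = [f0(0.5 * k * k) for k in ks]
fss = []
for k in ks:
    E = 0.5 * k * k
    df0dE = -f0(E) * (1.0 - f0(E)) / T      # derivative of the Fermi-Dirac f0
    fss.append(f0(E) - (g0 * k) * df0dE)    # g_k = g0*k is odd in k, like v_k

S0, Sss = entropy(feq), entropy(fss)
print(S0, Sss)   # Sss < S0: the drifting distribution is more ordered
```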

quantities f_k, s_k, and S are shown in equilibrium and in steady state. Fig. 21.14(a) shows that an external force such as an electric field changes the distribution function from f_k^0 → f_k = f_k^0 − g_k(∂f_k^0/∂E_k), where g_k = −τF·v_k is a perturbation with energy units, characterized by the relaxation time τ, the external force F, and the electron bandstructure via the group velocity v_k. The distribution function reaches a steady state f_k^ss if the external force is held constant (the dynamics was illustrated in Figures 21.12 and 21.13). Fig. 21.14(b) shows that as a result, the entropy density s_k changes from its equilibrium value s_k^0 to the steady-state value s_k^ss. Fig. 21.14(c) shows the total entropy S = ∑_k s_k as a function of the external perturbation g, indicating that the entropy of the electron system decreases when an external force is applied. This is because an electron distribution drifting in a direction has more order, and hence less entropy. When the external force is removed, f_k^ss → f_k^0, s_k^ss → s_k^0, and S → S^0 in a characteristic time of τ. The total entropy at equilibrium S^0 is its maximum value.

We will now prove that the rate of change of entropy due to collisions and scattering is always positive, dS_coll./dt ≥ 0, and that due to external forces is negative, dS_drift/dt ≤ 0. Taking the time derivative of S = −k_b ∑_k [f_k ln(f_k) + (1 − f_k) ln(1 − f_k)], using f_k = f_k^0 − g_k(∂f_k^0/∂E_k), and retaining up to linear terms in g_k gives

dS/dt = −k_b ∑_k (df_k/dt) ln(f_k/(1 − f_k)) ≈ k_b ∑_k (df_k/dt)[(E_k − µ)/(k_bT) − g_k/(k_bT)] ≈ −(1/T) ∑_k g_k (df_k/dt),  (21.44)

where the term with (E_k − µ)/k_bT leads to a net energy increase and non-ohmic effects, and is neglected. Writing S = S_coll. + S_drift we get

dS_coll./dt = −(1/T) ∑_k g_k (df_k/dt)|_coll.  and  dS_drift/dt = −(1/T) ∑_k g_k (df_k/dt)|_drift.  (21.45)

35 Here we use the microscopic reversibility relation S(k'→k) f_{k'}^0 (1 − f_k^0) = S(k→k') f_k^0 (1 − f_{k'}^0), and the identity −∂f_k^0/∂E_k = f_k^0(1 − f_k^0)/(k_bT).

Using f_k = f_k^0 − g_k(∂f_k^0/∂E_k) in the collision term on the RHS of the BTE35,

∑_k g_k (df_k/dt)|_coll. = ∑_k g_k ∑_{k'} [S(k'→k) f_{k'} (1 − f_k) − S(k→k') f_k (1 − f_{k'})]

= ∑_k g_k ∑_{k'} S(k→k') f_k^0 (1 − f_{k'}^0) [f_{k'}(1 − f_k)/(f_{k'}^0(1 − f_k^0)) − f_k(1 − f_{k'})/(f_k^0(1 − f_{k'}^0))]

⟹ ∑_k g_k (df_k/dt)|_coll. = (1/k_bT) ∑_k ∑_{k'} S(k→k') f_k^0 (1 − f_{k'}^0) [g_{k'} − g_k],  (21.46)

where in the last step we have retained terms to first order in g_k. Using


this form of (df_k/dt)|_coll. in Equation 21.45 for the collisions gives

dS_coll./dt = (1/k_bT²) ∑_k ∑_{k'} S(k→k') f_k^0 (1 − f_{k'}^0) g_k [g_k − g_{k'}]

(g_k ↔ g_{k'}) ⟹ dS_coll./dt = (1/k_bT²) ∑_k ∑_{k'} S(k→k') f_k^0 (1 − f_{k'}^0) g_{k'} [g_{k'} − g_k]

⟹ dS_coll./dt = (1/(2k_bT²)) ∑_k ∑_{k'} S(k→k') f_k^0 (1 − f_{k'}^0) [g_k − g_{k'}]² ≥ 0,  (21.47)

where we have used the fact that the sum over k, k' cannot change upon interchanging g_k ↔ g_{k'}, and then added the first two equations and divided by 2 to get the third. Since all terms in the RHS of the third form of dS_coll./dt are positive, including the perturbation which appears as [g_k − g_{k'}]², we have proven that the rate of production of entropy due to collisions is always positive36.

To prove dS_drift/dt ≤ 0, we first note that when the electron distribution is in the steady state, the net change of entropy is zero, implying dS/dt = d(S_coll. + S_drift)/dt = 0, and since dS_coll./dt ≥ 0, it follows that at steady state dS_drift/dt ≤ 0. The proof for the general case provides more insight into the physics. The LHS drift term of the BTE with Equation 21.45 gives37

(df_k/dt)|_drift = (F/ħ)·∇_k f_k + v_k·∇_r f_k ≈ −(∂f_k^0/∂E_k) v_k·[F + ∇_r(E_k − µ) + (E_k − µ) T ∇_r(1/T)]

⟹ dS_drift/dt = −(1/T) ∑_k g_k (−∂f_k^0/∂E_k) v_k·[F + ∇_r(E_k − µ) + (E_k − µ) T ∇_r(1/T)]

= −(1/T)(F − ∇_r µ)·∑_k v_k f_k − ∇_r(1/T)·∑_k (E_k − µ) v_k f_k

⟹ dS_drift/dt = −(1/T)[(F − ∇_r µ)·j + T ∇_r(1/T)·(J_u − µJ)] ≤ 0,  (21.48)

where g_k(−∂f_k^0/∂E_k) = f_k − f_k^0 was used, ∑_k v_k f_k = L^d j and ∑_k (E_k − µ) v_k f_k = L^d (J_u − µJ) identify the currents37, the first term in the square bracket is the Joule heat, and the second involves the heat current.

36 Because collisions randomize the velocities and energies, dS_coll./dt ≥ 0 makes physical sense. Note that dS_coll./dt ≥ 0 is seen in the slope of S(t) during the relaxation phase labeled "D" in Fig. 21.12. The balancing drift portion where dS_drift/dt ≤ 0 is labeled "B" in the same figure. Both are shown in k-space in Fig. 21.13. We recognize this behavior as the Boltzmann H-theorem again, but now for the particular solution f_k = f_k^0 − g_k(∂f_k^0/∂E_k) of the BTE.

37 For a semiconductor, E_k − µ = E_k − (E_F − E_c) = (E_c + E_k) − E_F, the argument of the exponent in the Fermi–Dirac function. Note that ∇_r E_k = 0, and so is ∑_k f_k^0 v_k, because f_k^0 is symmetric in k but v_k is antisymmetric.
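The positivity of the collision entropy production can be verified on a toy discrete system: build any set of rates S(k→k') obeying detailed balance with Fermi–Dirac occupations, and the symmetrization step leading to the quadratic form above holds identically for an arbitrary perturbation g_k. The rates, energies, and perturbation below are illustrative random numbers:

```python
# Finite-matrix check of the symmetrization step behind dS_coll/dt >= 0:
# with A(k,k') = S(k->k') f0_k (1 - f0_k'), detailed balance makes A symmetric,
# so sum A*g_k*(g_k - g_k') equals (1/2) sum A*(g_k - g_k')^2 >= 0.
import math, random

random.seed(0)
T, mu = 0.1, 0.5
E = [0.1 * i for i in range(5)]                  # energies of a 5-state toy system
f0 = [1.0 / (math.exp((e - mu) / T) + 1.0) for e in E]

# build S obeying detailed balance: S(i->j)/S(j->i) = exp[(E_i - E_j)/T]
S = [[0.0] * 5 for _ in range(5)]
for i in range(5):
    for j in range(i + 1, 5):
        S[j][i] = random.uniform(0.5, 2.0)               # S(j->i), arbitrary
        S[i][j] = S[j][i] * math.exp((E[i] - E[j]) / T)  # fixes S(i->j)

g = [random.uniform(-1.0, 1.0) for _ in range(5)]        # arbitrary perturbation

A = [[S[i][j] * f0[i] * (1.0 - f0[j]) for j in range(5)] for i in range(5)]
lhs = sum(A[i][j] * g[i] * (g[i] - g[j]) for i in range(5) for j in range(5))
sym = sum(0.5 * A[i][j] * (g[i] - g[j]) ** 2 for i in range(5) for j in range(5))
print(lhs, sym)   # equal and non-negative: collisions always produce entropy
```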


Several important currents are thus defined using entropy38. They are: the particle current j = (∑_k v_k f_k)/L^d, the charge current J = (−q)j, the energy current J_u = (∑_k E_k v_k f_k)/L^d, and the heat current J_q = J_u − µJ. The rate of entropy production due to the drift term is thus negative, because both terms in the square bracket are positive. The reason is very physical: the driving force F − ∇_rµ in the first term is the net electrochemical force. The first term is this force times the resulting particle current j,

38 The notation L^d implies a volume in d dimensions. Note that at equilibrium, all the defined net currents are zero.


39 The discussion here attempts to answer the question: "Why does transport of electrons or heat occur?" The dynamical laws of quantum (or classical) mechanics can answer what processes occur, but not why, because the dynamical laws are symmetrical in time, and are therefore reversible. The answer to why is provided by thermodynamics, because it identifies an arrow of time and explains irreversibility. It is explicitly seen that current conduction is a direct consequence of the production of entropy.

and since j ∝ F − ∇_rµ, the product must be positive. This term is the Joule heat produced by the drifting electrons. The driving force in the second term, ∇_r(1/T), is a gradient in the inverse temperature. Since the resulting heat current carried by electrons flows from higher to lower temperature, the product again must be positive. The proof of the negative sign of the rate of drift entropy change gives us a first glimpse of a deeper principle39: the rate of entropy production can be written as a sum of products of driving forces and resulting currents. It will pay off handsomely to develop the concept further. We will now see from thermodynamic arguments why the flow of electrons and heat is required to ensure the conservation of particles and energy, and how the direction of their flow is dictated by entropy. This consideration will reveal a richer range of transport phenomena, in which the charge current is only a part of the puzzle, with the heat current playing an important role. There is also a deep connection between them, first uncovered by Onsager.

The first law of thermodynamics states that the energy dU added to a system partitions into doing work pdV, adding particles µdN, and causing irrecoverable entropy TdS, via the energy conservation relation dU = TdS + pdV + µdN. For the electron system in solids, dV = 0. The volume densities u = U/V, s = S/V, and n = N/V give the time dependence

dU = TdS + µdN  ⟹  ∂u/∂t = T ∂s/∂t + µ ∂n/∂t.  (21.49)

Because the particle density n and energy density u follow conservation relations, inserting them into Equation 21.49 gives a continuity equation and a current density J_s = (J_u − µJ)/T for the entropy40:

40 F · J is the conventional Joule heat.

∂n/∂t + ∇_r·j = 0  and  ∂u/∂t + ∇_r·J_u = F·J

⟹ ∂s/∂t = (1/T)(∂u/∂t − µ ∂n/∂t) = (1/T)[−∇_r·(J_u − µJ) + F·J]

⟹ ∂s/∂t + ∇_r·((J_u − µJ)/T) = (1/T)[J·∇_r(µ/q − ϕ) + (J_u − µJ)·(−∇_rT/T)],

where (J_u − µJ)/T = J_s is the entropy current, J is the charge current, and J_u − µJ = J_q the heat current, so that

∂s/∂t + ∇_r·J_s = ∑_i J_i·F_i,  (21.50)


41 Here is the meaning of the theorem: if across a semiconductor there is both a gradient in electric potential and a gradient in temperature, then a charge current will flow due to both gradients, and a heat current will also flow due to both gradients. The charge and heat currents, and hence their coefficients, are linked.

where the generation rate of entropy on the RHS is expressed as a sum of products of forces F_i and currents J_i. A theorem of irreversible thermodynamics states that each current depends on every force41:

∂S/∂t = ∑_i J_i F_i  ⟹  J_i = ∑_j L_ij F_j by matrix inversion.  (21.51)

The theorem relates the currents that flow in response to the forces via the linear transport coefficients L_ij. Using this theorem, we now formulate



a generalized expression for transport coefficients (Equation 21.55) in response to forces that arise from gradients in the electrochemical potential and temperature, a "formula that rules them all" so to speak.

To find a generalized solution of the BTE in the RTA of the form f_k = f_k^0 − (∂f_k^0/∂E_k) g_k in the simultaneous presence of an electric force F, a magnetic field B, and a temperature gradient ∇_rT, let us use the trial form g_k = −τ_k (v_k·G), where the vector G is a generalized electro-magneto-thermal force. The electro-thermal forces, being polar vectors, are clubbed into42 F = ∇_r(µ − qϕ) + ((E − µ)/T) ∇_rT. The magnetic field, being an axial vector, is given special status. By substituting this trial form into the BTE, making a small perturbation approximation, and using the mobility µ = qτ/m*, a generalized force vector G is obtained:

G = [F + µ(F × B) + µ²(F·B)B] / (1 + µ²|B|²).  (21.52)


The B = 0 solution is G = F, an extension of Equation 21.42, but now including the temperature gradient explicitly. For B 6= 0 the solution is much richer43 . This solution of the BTE immediately yields the particle current j in response to E = F/Q, B, and ∇r T as the sum j=

1 Ld


∑ vk (− ∂Ekk ) gk = k

1 Ld


∑ vk (− ∂Ekk )(−τk vk · G), k

=⇒ J/Q = j = [K10 F + K20 (B × F) + K30 B(B · F) ] | {z } | {z } | {z } ∝E

Hall effect




where the important generalized kinetic coefficients Krs have been defined44 . The corresponding generalized heat current is Jq =

1 Ld

∑(Ek − µ)vk (− k

∂ f k0 )(−τk vk · G) ∂Ek

=⇒ Jq = [ K11 F + K21 (B × F) + K31 B(B · F)] | {z } | {z } | {z } Peltier




The kinetic coefficients45 for both charge current J in Equation 21.53 and heat current Jq in 21.54 takes the unified form ij

Krs =

43 Equation 21.52 is worked out in Ex-

ercise 21.9. The general solution with an effective mass tensor represented by a matrix [ M ] instead of the scalar m? is provided there. Note that for charge Q, the force F = QE where E is the electric field.

∂ f k0 vi v j τkr ( Ek − µ)s 1 Q r −1 · ( ) · (− , ∑ ∂Ek ) Ld m? 1 + ( Qτ?k )2 |B|2 k

44 The physical meaning of the terms and

their vector driving forces are explicitly seen here. For example, the charge current flows parallel to the electric field E, but the Hall-current flows along B × E, perpendicular to B and E. A charge current also flows in response to a temperature gradient in the absence of E and B: this is the Seebeck effect, and so on... 45 With this generalized formula we



∇r T ∇r T ∇r T +[K12 + K22 (B × ) + K32 B(B · ) ]. | {zT } | {z T } | {z T } −κ e ∇ r T


Eext = ∇r ( q − ϕ) is the sum of the externally applied electric field Eext = −∇r ϕ where ϕ is the scalar electric potential, and internal quasi-electric forces due to heterostructures or spatially nonuniform µ Ec doping Eint. = −∇r (−q) = ∇r ( EF − ) q that may be present.


∇r T ∇r T ∇r T +[K11 + K21 (B × ) + K31 B(B · ) ], | {zT } | {z T } | {z T } Seebeck

42 The total electric field E = E int. +



where the subindices (r, s) appear as powers of the relaxation time τ and energy ( Ek − µ) inside the sum and (i, j) capture its tensor nature.

can calculate experimentally measurable macroscopic kinetic coefficients from the microscopic bandstructure and scattering processes for transport in any dimension. The same formula is applicable to calculate the electrical and thermal conductivities, Hall effect and magnetoresistance, thermoelectric coefficients due to the Seebeck and Peltier effects, and several galvanomagnetic and thermomagnetic effects such as the NernstEttingshausen and Righi-Leduc effects, thus encompassing the rich range of transport phenomena exhibited in semiconductors and their quantized heterostructures. This chapter covers transport when B = 0, and Chapter 25 focuses on magnetotransport.
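As a quick numerical sanity check of Equation 21.52 (a sketch with arbitrary, unphysical numbers), the generalized force G can be evaluated directly: for B = 0 it reduces to F, and for F along x with B along z it acquires the Hall-like component along −y:

```python
# Numerical sketch of the generalized force of Equation 21.52,
# G = [F + mu (F x B) + mu^2 (F.B) B] / (1 + mu^2 |B|^2).
# The values of mu, F, and B are illustrative, not physical parameters.

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def G_vector(F, B, mu):
    """Generalized electro-magneto-thermal force (Equation 21.52)."""
    FxB = cross(F, B)
    FdB = dot(F, B)
    denom = 1.0 + mu**2 * dot(B, B)
    return [(F[i] + mu*FxB[i] + mu**2 * FdB * B[i]) / denom for i in range(3)]

mu = 0.5                      # "mobility" factor, arbitrary units
F  = [1.0, 0.0, 0.0]          # driving force along x

assert G_vector(F, [0.0, 0.0, 0.0], mu) == F   # B = 0 limit: G = F

G = G_vector(F, [0.0, 0.0, 2.0], mu)  # B along z: Hall-like component in -y
```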


46 Exercise 21.9 generalizes the kinetic coefficient to anisotropic effective masses represented by the tensor matrix [M].

The L^d in the denominator indicates the volume in d dimensions that cancels upon summation over k. Inside the sum, v_i = (1/ħ)(∇_k E_k)·k̂_i and v_j = (1/ħ)(∇_k E_k)·k̂_j are the group velocity components in the i and j directions obtained from the bandstructure, k̂_i is the unit vector in the i direction, Q = −q is the electron charge, and m* is the isotropic effective mass46. The last three kinetic coefficients of the particle current, K11, K21, and K31, are identical to the first three of the heat current, even though they seem to be different phenomena. These are the famous Onsager relations in action, as we discuss shortly.

21.8 Electrical conductivity

47 Since charge current and heat current (and a temperature gradient) are intertwined by Equations 21.53 and 21.54, this cannot be exactly correct. What is meant here is that there is no external temperature gradient imposed on the system.

Let us consider the simplest situation, in the absence of magnetic fields (B = 0) and temperature gradients (∇rT = 0), with an electric field E = E_int. + E_ext. = ∇r(µ/q − ϕ) as the lone force47. Before using the coefficients K_rs^{ij} of Equation 21.55, a direct connection to experimentally measured electrical conductivities due to drift and diffusion processes is possible from the RTA solution of Equation 21.42 with a k-dependent τ_k. For the electrical (charge) current J, the physical property associated with each k state is φ_k = −qv_k, with v_k = ħ⁻¹∇_k E_k the group velocity of state k and −q the electron charge, identical for all k. The current is

1 Ld

∑(−qvk ) f k ≈ − k

q q = − d ∑ vk f k0 + d L k L | {z } | =0

q Ld

∑ vk ( f k0 + τk vk · [(− k


∂ f k0 )F + (−∇r f k0 )]) ∂Ek


∑ vk (vk · F) ∂Ekk τk + Ld ∑ vk (vk · ∇r f k0 )τk , k







J diffusion



where three distinct components naturally result from the RTA:

48 This "internal" current J0 may be evaluated for any bandstructure and in any direction. It is not due to external fields or forces, but due to thermal energy in the case of non-degenerate carriers, or due to the Pauli exclusion principle for degenerate carriers. Since these currents are equal and opposite in all directions, they cancel to give a net zero current. These directional currents were seen to play a major role in ballistic transport in several earlier chapters.

1. "Zero" current: The first current term is zero by symmetry: the equilibrium Fermi–Dirac distribution function is even in k (f_k^0 = f_{−k}^0), but the group velocity is odd (v_k = −v_{−k}), so the sum of their product over all k is zero. However, we must not forget that the individual "right"-going component J_0^R and "left"-going component J_0^L of the current are not zero, but equal in magnitude to J0 = (−q/L^d) Σ_{k>0} v_k f_k^0 · n̂, where n̂ is a unit vector in any direction48.

2. Drift current: Consider the force F = −qE on an electron due to an electric field E alone. With the group velocity v_k = (1/ħ)∇_k E_k, the general drift current J_drift in Equation 21.56 reads J_drift = [σ]E, where



the electrical conductivity [σ] is a tensor:

\[ \begin{pmatrix} J_x \\ J_y \\ J_z \end{pmatrix} = \frac{q^2}{\hbar^2 L^d}\sum_{\mathbf{k}}\Big(-\frac{\partial f_k^0}{\partial E_k}\Big)\tau_k \begin{pmatrix} \big(\frac{\partial E_k}{\partial k_x}\big)^2 & \frac{\partial E_k}{\partial k_x}\frac{\partial E_k}{\partial k_y} & \frac{\partial E_k}{\partial k_x}\frac{\partial E_k}{\partial k_z} \\[4pt] \frac{\partial E_k}{\partial k_y}\frac{\partial E_k}{\partial k_x} & \big(\frac{\partial E_k}{\partial k_y}\big)^2 & \frac{\partial E_k}{\partial k_y}\frac{\partial E_k}{\partial k_z} \\[4pt] \frac{\partial E_k}{\partial k_z}\frac{\partial E_k}{\partial k_x} & \frac{\partial E_k}{\partial k_z}\frac{\partial E_k}{\partial k_y} & \big(\frac{\partial E_k}{\partial k_z}\big)^2 \end{pmatrix} \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix}, \tag{21.57} \]
\[ \implies \text{for } \mathbf{E} = \begin{pmatrix} E_x \\ 0 \\ 0 \end{pmatrix}, \quad J_x = \underbrace{\Big[\frac{q^2}{\hbar^2 L^d}\sum_{\mathbf{k}}\Big(-\frac{\partial f_k^0}{\partial E_k}\Big)\tau_k\Big(\frac{\partial E_k}{\partial k_x}\Big)^2\Big]}_{\sigma_{xx}\,=\,qn\mu_n}\cdot E_x. \]


The drift current density J_x in response to the electric field is thus simply the scalar conductivity σ_xx times the field E_x. This conductivity can be obtained for any bandstructure E_k if the scattering time τ_k is known. The conductivity is further split49 into the product σ_xx = qnµ_n of the electron charge with the mobile carrier density n and a drift mobility µ_n. The microscopic scattering processes contribute to the conductivity via the τ_k, and the bandstructure through E_k.

3. Diffusion current: From Equation 21.56 the diffusion current is driven by the real-space gradient of the equilibrium distribution function ∇r f_k^0. It is therefore written as J_diff = [D]∇r f_k^0 with a tensor diffusion constant [D], which in terms of all components is:

\[ \begin{pmatrix} J_x \\ J_y \\ J_z \end{pmatrix} = \frac{q}{\hbar^2 L^d}\sum_{\mathbf{k}}\tau_k \begin{pmatrix} \big(\frac{\partial E_k}{\partial k_x}\big)^2 & \frac{\partial E_k}{\partial k_x}\frac{\partial E_k}{\partial k_y} & \frac{\partial E_k}{\partial k_x}\frac{\partial E_k}{\partial k_z} \\[4pt] \frac{\partial E_k}{\partial k_y}\frac{\partial E_k}{\partial k_x} & \big(\frac{\partial E_k}{\partial k_y}\big)^2 & \frac{\partial E_k}{\partial k_y}\frac{\partial E_k}{\partial k_z} \\[4pt] \frac{\partial E_k}{\partial k_z}\frac{\partial E_k}{\partial k_x} & \frac{\partial E_k}{\partial k_z}\frac{\partial E_k}{\partial k_y} & \big(\frac{\partial E_k}{\partial k_z}\big)^2 \end{pmatrix} \begin{pmatrix} \partial f_k^0/\partial x \\ \partial f_k^0/\partial y \\ \partial f_k^0/\partial z \end{pmatrix}, \tag{21.58} \]
\[ \implies \text{for } \nabla_r f_k^0 = \begin{pmatrix} \partial f_k^0/\partial x \\ 0 \\ 0 \end{pmatrix}, \quad J_x = q\sum_{\mathbf{k}}\underbrace{\tau_k\frac{1}{\hbar^2}\Big(\frac{\partial E_k}{\partial k_x}\Big)^2}_{v^2\tau\,\sim\,D}\cdot\underbrace{\Big(\frac{1}{L^d}\frac{\partial f_k^0}{\partial x}\Big)}_{\sim\,\frac{dn}{dx}}. \]

49 This splitting is not just a matter of convenience, but of direct physical consequence in semiconductors. In diodes and transistors the mobile carrier density n changes by several orders of magnitude while in operation, whereas the mobility µn hardly changes at all.


The diffusion current is obtained if Ek and τk are known, both of which are identical to their values for the drift current. This microscopic connection manifests as the macroscopic Einstein relation, which is seen to emerge immediately if we connect the formulations of drift and diffusion currents in Equations 21.57 and 21.58 to the generalized coefficients in Equation 21.55. For B = ∇r T = 0, only the first term survives50 in Equation 21.53, which in terms of coefficient K10 gives: µ J = −qK10 E = −qK10 ∇r ( − ϕ) = q2 K10 Eext. + qK10 (−∇r µ) q | {z } | {z } qnµn Eext.

=⇒ µn =

qDn ∇r n

qK10 −∇r µ Dn k T and Dn = K10 =⇒ = b . (21.59) n ∇r n µn q

50 In the non-degenerate approximation n = N_c exp[(E_F − E_c)/k_bT] = N_c exp[µ/k_bT], which gives ∇rµ = (k_bT/n)∇r n for the diffusion component. The case of general degeneracy is given in Equation 21.62. The Einstein relation D/µ = k_bT/q is an example of a broad class of fluctuation–dissipation theorems of statistical mechanics.


Mobilities, anisotropic bandstructures, and scattering: The above formulations indicate currents could also flow in directions in which there are no forces! But this only occurs for anisotropic energy bandstructures of particular kinds. For ellipsoidal bandstructures of the type E_k = (ħ²/2)(k_x²/m_xx + k_y²/m_yy + k_z²/m_zz), which is the diagonal case of

\[ E_k = \frac{\hbar^2}{2}[\mathbf{k}]^T[M]^{-1}[\mathbf{k}] = \frac{\hbar^2}{2}\begin{pmatrix} k_x & k_y & k_z \end{pmatrix}\begin{pmatrix} \frac{1}{m_{xx}} & \frac{1}{m_{xy}} & \frac{1}{m_{xz}} \\[2pt] \frac{1}{m_{yx}} & \frac{1}{m_{yy}} & \frac{1}{m_{yz}} \\[2pt] \frac{1}{m_{zx}} & \frac{1}{m_{zy}} & \frac{1}{m_{zz}} \end{pmatrix}\begin{pmatrix} k_x \\ k_y \\ k_z \end{pmatrix}, \tag{21.60} \]

the off-diagonal terms of drift and diffusion currents in Equations 21.57 and 21.58 of the form (∂E_k/∂k_x)(∂E_k/∂k_y) = ħ⁴k_x k_y/(m_xx m_yy) vanish upon summation over all k on account of being odd around the k_x, k_y axes. The electrical conductivity and mobility (and the related diffusion constant) and the kinetic coefficient K_10^{ij} are therefore diagonal (and simpler!) for ellipsoidal bandstructures51. For electrons of density n = (1/L^d)Σ_k f_k^0 free to move in d dimensions, they are

\[ \sigma_d = q^2K_{10} = \frac{q^2}{L^d}\sum_{\mathbf{k}}\Big(-\frac{\partial f_k^0}{\partial E_k}\Big)v^2\tau_k \implies \mu_d = \frac{\sigma_d}{qn_d} = q\,\frac{\sum_{\mathbf{k}}\big(-\frac{\partial f_k^0}{\partial E_k}\big)v^2\tau_k}{\sum_{\mathbf{k}} f_k^0}. \tag{21.61} \]

Because the d-dimensional carrier concentration is n = N_c F_{d/2−1}(η), the generalized Einstein relation for any degeneracy becomes52

\[ \mu_n = \frac{qK_{10}}{N_c F_{\frac{d}{2}-1}(\eta)} \ \ \text{and}\ \ D_n = \frac{k_bT\,K_{10}}{N_c F_{\frac{d}{2}-2}(\eta)} \implies \frac{D_n}{\mu_n} = \frac{k_bT}{q}\cdot\frac{F_{\frac{d}{2}-1}(\eta)}{F_{\frac{d}{2}-2}(\eta)}. \tag{21.62} \]

A non-degenerately doped semiconductor at 300 K of mobility µ_n = 1000 cm²/V·s has a diffusion constant D_n = µ_n k_bT/q ≈ 26 cm²/s. For n = 10¹⁷/cm³ and |E_ext.| = 1 kV/cm the drift current density is J_drift = 16 kA/cm². For a concentration gradient of Δn = 10¹⁷/cm³ over a distance Δx = 10⁻⁴ cm, the diffusion current is J_diff. = 4.16 kA/cm².

To proceed further in the evaluation of electron mobilities, it is necessary to know the energy bandstructure E_k and the scattering time τ_k of the specific microscopic processes. For a parabolic bandstructure E_k = ħ²k²/2m*, using the kinetic coefficient K_10, a generalized expression for the electron mobility for transport in d dimensions is obtained53:

\[ \mu_d = \frac{q}{m^\star}\,\frac{\displaystyle\int_0^\infty du\,\frac{\tau_u\,u^{d/2}}{4\cosh^2\big[\frac{u-\eta}{2}\big]}}{\Gamma\big(\frac{d}{2}+1\big)F_{\frac{d}{2}-1}(\eta)} = \frac{q\tau_0}{m^\star}\,\frac{\Gamma\big(\frac{d}{2}+p+1\big)\,F_{\frac{d}{2}+p-1}(\eta)}{\Gamma\big(\frac{d}{2}+1\big)\,F_{\frac{d}{2}-1}(\eta)}. \tag{21.63} \]

51 This is indeed the case for most conduction bands: Si and Ge have ellipsoidal conduction band minima. Most III–Vs have spherical minima. Such luxury is not available for valence bands: for them spherical/ellipsoidal minima are a poor approximation.

52 In the non-degenerate approximation η ≪ −1 the Fermi–Dirac integrals F_j(η) ≈ e^η, which reduces to Equation 21.59. For the degenerate approximation η ≫ +1, the Einstein relation is D_n/µ_n = (2/d)(E_F − E_c)/q for diffusion in d dimensions.

53 The identity (−∂f_k^0/∂E_k) = 1/(4k_bT cosh²[(E_k − µ)/2k_bT]) is used here with u = E_k/k_bT and η = µ/k_bT = (E_F − E_c)/k_bT. For highly degenerate cases with η ≫ +1, (−∂f_k^0/∂E_k) ≈ δ(E_k − µ), and for non-degenerate cases with η ≪ −1, (−∂f_k^0/∂E_k) ≈ e^{η−u}/k_bT. If, as shown in Chapters 22 and 23, the scattering time can be expressed as τ_k = τ_0(E_k/k_bT)^p, the expression for the mobility is obtainable in terms of Gamma functions and the Fermi–Dirac integrals on the right of Equation 21.63.
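The back-of-the-envelope numbers quoted above can be checked directly (using q = 1.6 × 10⁻¹⁹ C and the thermal voltage k_bT/q = 26 mV, as in the text):

```python
# Verification of the worked example: mu_n = 1000 cm^2/Vs at 300 K gives
# D_n = mu_n kT/q ~ 26 cm^2/s; n = 1e17 /cm^3 and E = 1 kV/cm give a drift
# current of 16 kA/cm^2; a gradient of 1e17 /cm^3 over 1e-4 cm gives a
# diffusion current of ~4.16 kA/cm^2. All quantities in cm-based units.

q    = 1.6e-19          # electron charge, C
kT_q = 0.026            # thermal voltage k_b T / q at 300 K, V

mu_n = 1000.0           # drift mobility, cm^2/(V s)
D_n  = mu_n * kT_q      # Einstein relation (non-degenerate), cm^2/s

n = 1e17                # carrier density, cm^-3
E = 1000.0              # field, V/cm (= 1 kV/cm)
J_drift = q * n * mu_n * E          # drift current density, A/cm^2

dn_dx  = 1e17 / 1e-4    # concentration gradient, cm^-4
J_diff = q * D_n * dn_dx            # diffusion current density, A/cm^2
```

Both current densities come out in the kA/cm² range quoted in the text, with the diffusion current comparable to the drift current for this (steep) gradient.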



For the case of d = 3 and η ≫ +1, representing highly degenerate electron gases similar to a metal, the above expressions simplify on account of the Dirac delta function, giving σ ∼ nq²τ_F/m* and µ ∼ qτ_F/m*, results similar to the Drude model for a metal. For non-degenerate situations and the general case, a knowledge of the k- or energy-dependence of τ_k is necessary. This is as far as one can proceed without this knowledge. The evaluation of the individual scattering rates τ_k, expressed as τ_u in Equation 21.63, is the topic of Chapter 22 for phonon scattering and Chapter 23 for defect and impurity scattering54. For the rest of this section, we discuss general features, and the nature of τ that is obtained from the RTA.

We focus on three types of relaxation times: the quantum τ_q, momentum τ_m, and energy τ_E relaxation times, which represent physically different time constants. Before providing a derivation, they are simple to state and to understand:

\[ \frac{1}{\tau_q(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}') \qquad \text{[Quantum Relaxation Rate]} \]
\[ \frac{1}{\tau_m(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')\Big(1 - \frac{\mathbf{E}\cdot\mathbf{v}_{k'}}{\mathbf{E}\cdot\mathbf{v}_k}\Big) \qquad \text{[Momentum Relaxation Rate]} \]
\[ \frac{1}{\tau_E(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')\Big(1 - \frac{E_{k'}}{E_k}\Big) \qquad \text{[Energy Relaxation Rate]}. \tag{21.64} \]

54 For example, in a 2D electron gas of m* = 0.067m₀ and scattering time τ_F ∼ 0.3 ps, the mobility is ∼ 8,000 cm²/V·s and the sheet resistance 1/σ is ∼ 800 Ω/□, fairly representative of 2DEGs in AlGaAs/GaAs heterostructures at room temperature, where phonon scattering limits the τ_F. At low temperatures τ_F can increase dramatically in clean samples, from τ_F ∼ 0.3 ps to τ_F ∼ 300 ps or more as phonons are frozen out, boosting the mobility to 8,000,000 cm²/V·s and revealing rich quantum transport phenomena such as quantized conductance and the quantum Hall effect, discussed in Chapter 25.

The relaxation rate for state k is a sum over the scattering "out" transition rates S(k → k′) from that state to all other possible states k′, obtained by Fermi's golden rule separately for each type of scattering mechanism. The quantum and energy relaxation rates are properties of the k states alone. The momentum scattering rate depends on k, and can also depend on the direction of the electric field E in anisotropic bandstructures. The forms in Equation 21.64 can be suitably amended to account for the Pauli exclusion principle when needed, as will be demonstrated when we use them. In the presence of scattering, the quantum relaxation rate is never zero. If the momentum represented by the group velocity v_k does not change upon scattering, then the momentum relaxation rate is zero for that event. Similarly, if the energy does not change in a scattering event, E_k = E_{k′} makes the energy relaxation rate zero, which occurs for elastic scattering events. Fig. 21.15 illustrates the scattering times. Due to "quantum" scattering, after a time τ_q the electron distribution loses the memory that the electrons were all pointing in exactly the same direction, but still retains memory of the net initial momentum. The distance propagated in this time is l_q ∼ vτ_q. After the momentum relaxation time τ_m and a distance l_m ∼ vτ_m, the memory of the net initial momentum is erased, but the electron system still has the memory of the initial energy, as represented by the lengths of the arrows. After the energy relaxation time τ_E and length l_E ∼ vτ_E, the electrons have scattered and lost their initial energy, and come to equilibrium with the crystal and the defects and vibrations. Typically τ_q < τ_m < τ_E, and they are of the order of a picosecond at room temperature. These values can change significantly depending on the situation. Let us now directly obtain the momentum relaxation rate from the RTA solution of the BTE.

Fig. 21.15 The three relaxation times shown schematically in a space-time diagram for several electrons injected into a semiconductor as a collimated beam at time t = 0. The direction and length of the arrows represent the momentum vector for each electron. The quantum, momentum, and energy relaxation times, and the corresponding mean-free paths are indicated.


For elastic scattering the electron energy E_k = E_{k′} is the same before and after scattering, and therefore S(k′ → k) = S(k → k′) from the principle of microscopic reversibility. The collision term in the RHS of the BTE, ∂f_k/∂t|_coll. = −Σ_{k′} S(k → k′)[f_k − f_{k′}], rewritten as

\[ \frac{\partial f_k}{\partial t}\Big|_{\rm coll.} + f_k\underbrace{\Big[\sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')\Big]}_{1/\tau_q(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')f_{k'}, \]

shows the meaning of the quantum relaxation rate as the sum of the scattering rates 1/τ_q(k) = Σ_{k′} S(k → k′) from a state k to all other states k′. The corresponding quantum scattering time τ_q(k) may be viewed as a "lifetime" of the particle in the state |k⟩. A particle in this state |k⟩ at time t = 0 will scatter out into other states |k′⟩ due to collisions. The occupation function f_k of that state will approach the equilibrium value f_k^0 exponentially fast, as ∼ exp[−t/τ_q(k)], with the time constant τ_q(k) when external fields or perturbations are removed. Consider the steady-state case in the presence of an electric field that produces a force −qE, making the solution of the BTE in the RTA f_k = f_k^0 + qτ_k v_k·E (∂f_k^0/∂E_k). For elastic scattering |k| = |k′|, which makes f_k^0 = f_{k′}^0 and τ_k = τ_{k′}. The momentum relaxation rate is then obtained:

\[ \frac{\partial f_k}{\partial t}\Big|_{\rm coll.} = -\sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')[f_k - f_{k'}] \ \ \&\ \ f_k - f_{k'} = q\tau_k\,\mathbf{v}_k\cdot\mathbf{E}\,\frac{\partial f_k^0}{\partial E_k}\Big[1 - \frac{\mathbf{E}\cdot\mathbf{v}_{k'}}{\mathbf{E}\cdot\mathbf{v}_k}\Big] \]
\[ \implies \frac{\partial f_k}{\partial t}\Big|_{\rm coll.} = -\frac{f_k - f_k^0}{\tau_m(\mathbf{k})} \quad\text{with}\quad \frac{1}{\tau_m(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k}\to\mathbf{k}')\Big[1 - \frac{\mathbf{E}\cdot\mathbf{v}_{k'}}{\mathbf{E}\cdot\mathbf{v}_k}\Big]. \]

Fig. 21.16 Angular relations between k, k′, and E vectors in the Boltzmann transport equation.

Since f_k − f_k^0 does not depend on k′, it is possible to pull it out of the summation, leading to a clean definition of the momentum relaxation rate in the elastic scattering approximation. The formulation further simplifies if the angular relations between the three vectors — the group velocities v_k, v_{k′} and the electric field E — are considered, as shown in Fig. 21.16 for a parabolic bandstructure, when v_k = ħk/m* and v_{k′} = ħk′/m*. Fixing the z-axis along k and the y-axis so that E lies in the y–z plane, from Fig. 21.16 the relation

\[ \frac{\mathbf{k}'\cdot\mathbf{E}}{\mathbf{k}\cdot\mathbf{E}} = \frac{\cos\beta}{\cos\alpha} = \cos\theta + \sin\theta\sin\phi\tan\alpha \]

is obtained (this is the addition theorem of spherical harmonics). When the sum over all final states k′ is performed, the term containing sin φ vanishes by symmetry, simplifying the momentum relaxation time τ_m(k) to a form that is used heavily in evaluating scattering rates:

\[ \frac{1}{\tau_m(\mathbf{k})} = \sum_{\mathbf{k}'} S(\mathbf{k},\mathbf{k}')(1 - \cos\theta), \]





where θ is the angle between k and k′. Though derived in 3D, it holds for other dimensions. The momentum relaxation time is what determines the electrical conductivity and mobility via µ = q⟨τ_m⟩/m*, suitably averaged over all states ⟨τ_m⟩ as derived in Equation 21.63. The 1 − cos θ weight takes a large value for back-scattering, when θ → π, making τ_m small and thus lowering the mobility. For small-angle scattering, when θ → 0, the factor can be much smaller than unity: this increases τ_m and results in a higher mobility55. The quantum scattering rate 1/τ_q(k) = Σ_{k′} S(k → k′) and the momentum scattering rate 1/τ_m(k) = Σ_{k′} S(k → k′)(1 − cos θ) are both experimentally accessible quantities56, and provide a valuable method to identify the nature of scattering mechanisms. The momentum scattering time τ_m(k) measures the average time spent by the particle moving along the external field. It differs from the quantum lifetime due to the cos θ term. For scattering processes that are isotropic, S(k, k′) has no angle dependence, the cos θ term sums to zero when integrated over all θ, and τ_q = τ_m. However, for scattering processes that favor small-angle (θ → 0) scattering, from the definitions, τ_m > τ_q. For example, Coulomb potentials heavily favor small-angle scattering, whereas short-range scatterers such as uncharged point defects are isotropic scatterers, as discussed further in Chapter 23. The difference between the quantum and momentum scattering times is used to distinguish between different scattering mechanisms in experiments.
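The contrast between τ_q and τ_m can be illustrated with a toy angular integration: for an isotropic S(θ) the two rates coincide, while for a forward-peaked (Coulomb-like) S(θ) the momentum time greatly exceeds the quantum time. The angular form used below is illustrative only, not a golden-rule rate:

```python
# Toy comparison of 1/tau_q = sum S(theta) and
# 1/tau_m = sum S(theta)(1 - cos theta), integrated over the 3D solid
# angle (sin theta weight). Overall prefactors cancel in the ratio.

import math

def rates(S, N=100000):
    """Midpoint-rule integration of the two rates over theta in (0, pi)."""
    inv_tq = inv_tm = 0.0
    for i in range(N):
        th = (i + 0.5) * math.pi / N
        w = S(th) * math.sin(th) * (math.pi / N)
        inv_tq += w                          # quantum rate: every event counts
        inv_tm += w * (1.0 - math.cos(th))   # momentum rate: weighted by angle
    return inv_tq, inv_tm

iso_q, iso_m = rates(lambda th: 1.0)                        # isotropic scatterer
fwd_q, fwd_m = rates(lambda th: 1.0 / (th**2 + 0.1**2)**2)  # forward-peaked

ratio_iso = iso_q / iso_m   # tau_m / tau_q = (1/tau_q) / (1/tau_m)
ratio_fwd = fwd_q / fwd_m
```

For the isotropic case the ratio is 1 (τ_q = τ_m), while the forward-peaked scatterer gives τ_m/τ_q ≫ 1, the signature used experimentally to identify Coulombic scattering.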

55 Examples are discussed in detail in Chapters 22 and 23 for various scattering mechanisms.

56 Typically the momentum scattering time is extracted from either a Hall-effect or a field-effect measurement of the mobility. The quantum scattering time is obtained from low-temperature magnetotransport quantum oscillations, such as the Shubnikov–de Haas oscillations of the longitudinal magnetoresistance, as discussed in Chapter 25.

21.9 Thermoelectric properties

Fig. 21.17 schematically explains how a temperature gradient across a semiconductor drives an electrical current, or generates a voltage across it. The energy band diagram shows that the Fermi function f_h has a longer high-energy tail in the hot region on the left side, and a smaller tail in f_c for the cold region on the right. There is a preferential motion of electrons at higher energies to the right, each transporting an energy of amount E_k − µ ∼ k_bT_h from the left to the right side, and a charge of magnitude q. Similarly, an energy of amount k_bT_c is transported from the right to the left. If the electrons were to accumulate in the electrodes at the two ends, a voltage difference of magnitude V ∼ (k_b/q)(T_h − T_c) = (k_b/q)ΔT builds up, which is ∼ 86 µV for ΔT = 1 K. The sign of the voltage is sensitive to the direction of the temperature gradient and the charge of the carrier: it can distinguish between n- and p-type semiconductors. If the left and right sides are brought closer in space, the same voltage difference will lead to a larger internal electric field, proportional to the temperature gradient ΔT/Δx. Completing the outer circuit will drive a current proportional to the temperature gradient, J ∼ dT/dx. The Fermi function difference f_h − f_c in Fig. 21.17 (c) shows that most of the thermoelectric current is carried by electrons close to E_k ≥ µ = E_F − E_c. Designing a large DOS in this energy window is a strategy to maximize thermoelectric effects.


Fig. 21.17 (a): The energy band diagram, (b): the distribution functions f_h, f_c, and (c): the driving force for thermoelectric transport, the difference of the distribution functions f_h − f_c. The semiconductor with band edges E_c, E_v, and energy gap E_g is hot on the left side and cold on the right, creating a temperature gradient dT/dx = ∇rT. The solid lines of (b) and (c) are when there is a temperature gradient but no externally applied voltage difference. The dashed and dot-dashed lines indicate f_h, f_c and f_h − f_c when an external voltage difference is also present in addition to the temperature gradient.


These qualitative thermoelectric features are now investigated quantitatively using the RTA solution of the BTE. Equations 21.53 and 21.54 show that the three forces that drive charge and heat current are the electric field E, temperature gradients ∇rT, and the magnetic field B. In the last section, we investigated only the charge current flow in response to an electric field assuming B = ∇rT = 0. For thermoelectric effects we retain the temperature gradient ∇rT but keep B = 0 in this chapter (Chapter 25 covers thermoelectric phenomena for B ≠ 0). The electro-thermal force vector for B = 0 from Equation 21.52 is G = −qE + ((E_k − µ)/T)∇rT. Consider first the effects of electric field and thermal gradient separately in the RTA solution f_k = f_k^0 + τ_k v_k·G(−∂f_k^0/∂E_k).

Fig. 21.18 (a) shows the RTA solution of the BTE for a degenerate equilibrium electron distribution f_k^0, which shifts by Δk = −qτE/ħ in k-space in response to an electric field E. Fig. 21.18 (b) shows that the response to a temperature gradient ∇rT is different. The Δk is obtained by expressing the RTA solution as a Taylor expansion of the type f_k = f_k^0 − Δk·∇_k f_k^0 to get

\[ f_k = f_k^0 + \tau_k\frac{(E_k-\mu)}{T}(\mathbf{v}_k\cdot\nabla_r T)\frac{\partial f_k^0}{\partial E_k} \implies \Delta\mathbf{k} = -\frac{(E_k-\mu)\tau_k}{\hbar T}\nabla_r T, \tag{21.69} \]

which for small Δk follows f_k = f_k^0 − Δk·∇_k f_k^0 ≈ f_k^0(k − Δk). This means the equilibrium Fermi–Dirac distribution is shifted by Δk. But for a temperature gradient, Δk from Equation 21.69 depends on the sign of E_k − µ = E_k − (E_F − E_c) for the conduction band, for example. The case for a degenerate electron distribution is shown in Fig. 21.18 (b). The temperature gradient ∇rT < 0 because the left side is hot. Then, for right-going electrons k > 0 and energies above the Fermi level E_k ≥ µ, the sign is Δk > 0. For right-going electrons and E_k ≤ µ, Δk < 0, and all the other shifts of f_k from the equilibrium value f_k^0 can be reasoned in this way. By thus breaking the symmetry of f_k^0, the temperature gradient leads to a net flow of electrons from the left to the right, and a charge current flow from the right to the left. If the semiconductor is non-degenerate, the electron energies E_c + E_k > E_F imply E_k − µ > 0 at all k values, and thus the distribution function shifts in a manner similar to the electric field case, as shown in Fig. 21.18 (c). This is a crucial difference between degenerate and non-degenerate semiconductors. Loosely speaking, thermoelectric effects are stronger for non-degenerate semiconductors. Since charge and heat currents are coupled, the thermoelectric effects are obtained directly from the coupled charge and heat transport Equations 21.53 and 21.54, which for B = 0 reduce to

\[ \frac{\mathbf{J}}{(-q)} = K_{10}\mathbf{E} + K_{11}\frac{\nabla_r T}{T}, \quad\text{and}\quad \mathbf{J}_q = K_{11}\mathbf{E} + K_{12}\frac{\nabla_r T}{T}. \tag{21.70} \]

Fig. 21.18 Difference between the effects of an electric field (a) and a thermal gradient (b) on the distribution function for degenerate electron distributions. In (c) is the case for a non-degenerate distribution. In this case, because the Fermi level is in the energy gap, E_k − µ > 0 does not switch sign. Note that µ = E_F − E_c for the conduction band. The labels "Hot" and "Cold" are in real space, so the carriers with +k wavevector are from the left (Hot) side.


In experiments, it is easier to control and measure an electric current J than the internal electric field E. Expressing E and the heat current Jq

Thermoelectric properties 513


in terms of the current J and the temperature gradient gives

\[ \mathbf{E} = \Big(\underbrace{\frac{1}{q^2K_{10}}}_{\rho}\Big)\mathbf{J} + \Big(\underbrace{-\frac{K_{11}}{qK_{10}T}}_{S}\Big)\nabla_r T \implies \boxed{\mathbf{E} = \rho\mathbf{J} + S\nabla_r T}, \ \ \text{and} \]
\[ \mathbf{J}_q = \Big(\underbrace{-\frac{K_{11}}{qK_{10}}}_{\Pi}\Big)\mathbf{J} + \Big(\underbrace{-\frac{K_{12} - K_{11}^2/K_{10}}{T}}_{-\kappa_e}\Big)\nabla_r T \implies \boxed{\mathbf{J}_q = \Pi\mathbf{J} - \kappa_e\nabla_r T}, \tag{21.71} \]

where the boxed equations define the four transport coefficients: the electrical resistivity ρ = 1/σ, the Seebeck coefficient57 S, the Peltier coefficient Π, and the electronic thermal conductivity κ_e. The electrical resistivity is the inverse of the electrical conductivity, which was discussed in the last section. We focus on the interrelatedness of the three thermoelectric coefficients, and their practical applications.

57 To avoid excessive symbols we use S for the Seebeck coefficient, assuming that the context of usage distinguishes it from the scattering rate S(k → k′) and the entropy S. Though it must be mentioned that the Seebeck coefficient has a deeper connection to entropy: in magnitude it is the entropy per unit charge, S = S/nq, where n is the carrier density. The Seebeck coefficient is also called the thermopower.

Seebeck coefficient: Writing the charge current as J = σ(−∇rV) − σS∇rT, and from Fig. 21.17 (a) considering the open-circuit situation when J = 0, the Seebeck coefficient becomes S = −ΔV/ΔT, the voltage difference resulting from a temperature difference. This is how it is experimentally measured. The values reached in lightly doped semiconductors around room temperature can be as high as S ∼ 10 k_b/q ∼ 1 mV/K, but in heavily doped semiconductors and metals it is an order of magnitude lower. The reason becomes clear if the Seebeck coefficient is directly calculated using S = −K11/(qK10T) from Equation 21.71 and the generalized kinetic coefficients from Equation 21.55:

\[ S = -\frac{K_{11}}{qK_{10}T} = -\frac{1}{qT}\,\frac{\sum_{\mathbf{k}}\big(-\frac{\partial f_k^0}{\partial E_k}\big)v_k^2\tau_k(E_k-\mu)}{\sum_{\mathbf{k}}\big(-\frac{\partial f_k^0}{\partial E_k}\big)v_k^2\tau_k} = -\frac{\langle E_k-\mu\rangle}{qT}, \tag{21.72} \]

where the angular bracket ⟨E_k − µ⟩ indicates the kind of weighted averaging indicated by the sum over all k states. This is a generalization of what is commonly referred to as the Mott formula for the Seebeck coefficient. Evaluating the above sum for parabolic bands in d dimensions assuming τ_k = τ_0(E_k/k_bT)^r, we get for the non-degenerate η ≪ −1 and degenerate η ≫ +1 limits (see Exercise 21.11):

\[ S\underset{\eta\ll-1}{\approx} -\frac{k_b}{q}\Big[\underbrace{\frac{E_c-E_F}{k_bT}}_{-\eta} + \frac{d+2}{2} + r\Big] \quad\text{and}\quad S\underset{\eta\gg+1}{\approx} -\frac{k_b}{q}\,\frac{\pi^2}{3}\Big[\frac{d}{2}+r\Big]\frac{k_bT}{E_F-E_c}. \tag{21.73} \]

Fig. 21.19 (a) The Seebeck coefficient S, (b) the electrical conductivity σ, and the power factor S²σ for a semiconductor, plotted in dimensionless units as a function of η = (E_F − E_c)/k_bT, the location of the Fermi level with respect to the conduction band edge.

58 For non-degenerate semiconductors, the Seebeck coefficient is dominated by the η = (E_F − E_c)/k_bT term, which can be much larger than the (d+2)/2 + r term, which is of the order of unity. For this case one can approximate |S| ≈ (k_b/q)|η|. The general expression for all values of η is S = −(k_b/q)[−η + ((d+2)/2 + r)·F_{r+d/2}(η)/F_{r+d/2−1}(η)].

Fig. 21.19 (a) shows the Seebeck coefficient and (b) the electrical conductivity as a function of the Fermi level. When the Fermi level is deep inside the bandgap, the Seebeck coefficient is high58, but the electrical conductivity is low because there are very few carriers. The electrical conductivity increases when the Fermi level enters the conduction band due to a large increase in carrier density, but the Seebeck


Fig. 21.20 Seebeck and thermocouple effect. The voltage Vth that develops at the electrode terminals is proportional to the temperature difference T2 − T1 .

coefficient plummets because k_bT is small compared to the degeneracy E_F − E_c. The product S²σ thus has a maximum when the Fermi level is located close to the band edge. This product turns up as the power factor in thermoelectric materials, as discussed shortly. The Seebeck effect is the principle of operation of thermocouples, which produce a voltage proportional to a junction temperature. The thermocouple arrangement also allows the measurement of the Seebeck coefficient of semiconductors. Consider Fig. 21.20, in which a semiconductor is connected to two metal contacts in an open circuit. Both metal leads are made of the same metal, and the terminals are held at the same temperature T0. If the two metal/semiconductor contacts are at temperatures T1 and T2, a thermal voltage of

\[ -V_{\rm th} = \oint S\,dT = \int_{T_0}^{T_1} S_{\rm met}\,dT + \int_{T_1}^{T_2} S_{\rm semi}\,dT + \int_{T_2}^{T_0} S_{\rm met}\,dT = \int_{T_1}^{T_2}[S_{\rm semi} - S_{\rm met}]\,dT \approx S_{\rm semi}(T_2 - T_1) \tag{21.74} \]

develops across the metal terminals, which can be measured and tabulated for various T2 − T1 to give the Seebeck coefficient of the semiconductor S_semi. The integral is taken in the clockwise loop as indicated in Fig. 21.20. Typically the Seebeck coefficient of a semiconductor is much larger than that of metals S_met, but the approximation S_semi − S_met ≈ S_semi is not necessary if the Seebeck coefficient of the metal is known.

59 That the Peltier and Seebeck coefficients are related in this way is one of the Onsager relations, which is discussed in the next section. For non-degenerate semiconductors, the Peltier coefficient is simply |Jq|/|J| = Π = ST ≈ (E_F − E_c)/q, which quantitatively verifies the prior qualitative discussion that E_c − E_F of heat is transported by charge −q.

Peltier coefficient: Closely related to the Seebeck effect is the fact that the flow of an electrical current, in addition to transporting charge, also transports heat, even in the absence of an externally enforced temperature gradient. This is the Peltier effect, which relates the heat current to the charge current via Equation 21.71, which for ∇r T = 0 is Jq = ΠJ, where the Peltier coefficient Π = K11/K10 = ST is simply the Seebeck coefficient times the temperature, with the unit of volts⁵⁹. Fig. 21.21 shows a charge current J flowing across a junction of materials of different Peltier coefficients. Because Jq1 = Π1 J and Jq2 = Π2 J, we have Jq1 − Jq2 = ∆Jq = (Π1 − Π2)J. This implies that a Peltier coefficient difference will produce heating at the junction when the charge current flows in one direction. But when the charge current flows in the opposite direction, because ∆Jq changes sign, the junction will experience cooling! This is the principle of Peltier heating and cooling. Equations 21.50 and 21.71 give the net generated heat as

Q = J · E − ∇r · Jq = J · [ (1/σ) J + S∇r T ] − ∇r · [ ΠJ − κe ∇r T ]

Fig. 21.21 Peltier heating and cooling.


= |J|²/σ − T (dS/dT) J · ∇r T + ∇r · [κe (∇r T)],   with   ZT = S²σT/(κe + κl),   (21.75)

where the net heat produced is split into three terms. The first term is the familiar Joule heat, and the third term is the heat due to the


electronic part of the thermal conductivity gradient. The second term, called the Thomson heat, is unconventional: unlike Joule heat, which is always positive, the Thomson heat switches sign when the current switches direction. This heat is either released, or absorbed from the environment, depending on the direction of the current! This is rather remarkable, and can be put to clever use for practical purposes, such as a single device that operates as a Peltier refrigerator and oven, as shown in Fig. 21.22. The principle of operation is similar to Fig. 21.21, but now there are several junctions, and n- and p-type semiconductors are used to boost the Peltier coefficient difference. The most efficient operation is achieved if the Thomson term in Equation 21.75 is maximized, and the Joule and thermal conductivity terms are minimized. This requires a high Seebeck coefficient, a large electrical conductivity, and a low thermal conductivity, requirements embodied in the factor ZT of Equation 21.75. The power factor S²σ appears in the numerator, and the total thermal conductivity, which includes the electronic (κe) and lattice (κl) parts, appears in the denominator. That ZT is an important thermoelectric figure of merit is seen by writing a simplified version of Equation 21.75 for one of the legs of the Peltier cooler in Fig. 21.22, of length L, cross-section A, thermal conductivity κ, and resistance R = L/(σA), as Pc = I²R − SIT + κ(A/L)∆T, where we have used T(dS/dT) ∼ S for a non-degenerate semiconductor. The heat extraction from the cold side is maximized for the current I for which ∂Pc/∂I = 0, which is I = ST/2R. Substituting back we get Pc,max = κ(A/L)∆T − S²T²/(4R), which when set to zero gives the maximum temperature difference between the hot and cold sides. This difference, ∆T = (1/4)·(S²σ/κ)·T², indicates that maximizing the factor Z = S²σ/κ indeed maximizes the thermoelectric performance of the device.
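The algebra above is easy to check numerically. The sketch below uses illustrative material parameters roughly in the range of Bi2Te3 (the numbers are assumptions, not data from the text) and confirms that ∆T_max = (1/4)(S²σ/κ)T² is independent of the leg geometry:

```python
# Maximum temperature difference of a single Peltier leg, following the
# optimization in the text: P_c = I^2 R - S I T + (kappa A / L) dT.
S = 200e-6      # Seebeck coefficient, V/K (illustrative, ~Bi2Te3 range)
sigma = 1.0e5   # electrical conductivity, S/m (assumed)
kappa = 1.5     # thermal conductivity, W/(m K) (assumed)
T = 300.0       # operating temperature, K

def dT_max(L, A):
    R = L / (sigma * A)       # electrical resistance of the leg
    K = kappa * A / L         # thermal conductance of the leg
    I_opt = S * T / (2 * R)   # current from dP_c/dI = 0 (not needed below)
    # Setting P_c,max = K dT - S^2 T^2/(4R) = 0 gives the maximum dT:
    return S**2 * T**2 / (4 * R * K)

dT1 = dT_max(L=1e-3, A=1e-6)
dT2 = dT_max(L=5e-3, A=4e-6)   # different geometry, same dT_max
Z = S**2 * sigma / kappa       # thermoelectric figure of merit, 1/K
```

Because R·K = κ/σ regardless of L and A, the geometry drops out and dT_max equals 0.25·Z·T² (about 60 K for these assumed numbers).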
If, instead of flowing a current, the device is placed between a hot and a cold surface, a voltage is generated across the two leads, converting the temperature difference Th − Tc into a voltage, making it a thermoelectric battery. The use of the alternating n- and p-semiconductors will then, via the Seebeck effect, generate voltages Vn = −Sn (Th − Tc) and Vp = −Sp (Tc − Th) across each stage, and for N n-p stages, they add in series to produce an open circuit voltage of magnitude |Voc| ∼ N (Th − Tc)(|Sn| + |Sp|). ZT, electronic and thermal conductivity, and Lorenz number: To investigate the variation of ZT in various materials, we discuss the electronic and thermal conductivities. In the presence of a temperature gradient, if there is no charge current flow (J = 0), Equation 21.71 shows that there is still a heat current Jq = −κe ∇r T carried by electrons. In metals, most of the heat current is carried by electrons, but in semiconductors, a large portion of the heat current is carried by lattice vibrations or phonons (discussed in Chapter 22), and electrons contribute only a very small part due to their low density. This small portion defines the electronic part of the thermal conductivity, which in terms of the generalized kinetic coefficients is κe = (K12 − K11²/K10)/T.
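The series-addition of stage voltages is a one-line estimate. The sketch below uses assumed numbers (module size and Seebeck coefficients chosen only to illustrate magnitudes):

```python
# Open-circuit voltage of an N-stage thermoelectric generator,
# |Voc| ~ N (Th - Tc)(|Sn| + |Sp|). All numbers below are assumptions.
N = 127                       # number of n-p couples (a common module size)
S_n, S_p = -200e-6, 200e-6    # Seebeck coefficients, V/K
Th, Tc = 400.0, 300.0         # hot and cold side temperatures, K

V_oc = N * (Th - Tc) * (abs(S_n) + abs(S_p))
```

For these numbers the open-circuit voltage is a few volts, which is why practical modules stack many n-p couples electrically in series.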


Fig. 21.22 A semiconductor based solidstate Peltier cooler and heater. Current flowing in one direction heats the upper surface and cools the lower, and vice versa upon switching the current direction. The metal/semiconductor/metal structures are connected in series, the semiconductors alternating between nand p-type. The top and bottom metal electrodes are in thermal contact with different plates. The device is electrically in series, but thermally in parallel.

Fig. 21.23 Abram Ioffe, the physicist who early on recognized the potential of semiconductors for thermoelectric power conversion, among several other contributions to physics.


Just as for the classical Drude model in Chapter 2, the ratio of the electronic part of the thermal conductivity and the electronic conductivity defines a Lorenz number (see Exercise 21.12 for the exact form):

L ≡ κe/(σT) = (1/(q²T²)) [ K12/K10 − (K11/K10)² ] = [ ⟨(Ek − µ)²⟩ − ⟨Ek − µ⟩² ] / (q²T²),   (21.76)

with the asymptotic limits L ≈ (r + (d+2)/2) · (kb/q)² for η ≪ −1, and L ≈ (π²/3) · (kb/q)² for η ≫ +1.

Fig. 21.24 Lorenz number for electron transport in d = 2 dimensions.

Fig. 21.25 Thermoelectric ZT values vs. temperature in °C for various n-type and p-type thermoelectrics.

60 A fascinating application is the radioisotope thermoelectric generator, which converts the heat released by the decay of radioactive materials into electricity. Such generators last on the order of the half-life of the radioactive material, and can thus be ”self-powered” for hundreds of years. They are therefore used in places that are prohibitive for human life. Thermoelectric generators have helped explore the parts of the solar system where solar energy is not available, and power the Voyager spacecraft, among the few human-built devices to have left the solar system.

where d is the number of dimensions the charge moves in, and r, as before for the Seebeck coefficient, appears as the energy exponent of the scattering time τk = τ0 (Ek/(kb T))^r. Fig. 21.24 shows the Lorenz number for electron transport in two dimensions, which occurs in MOSFETs and in 2D electron gases in semiconductor quantum wells. When the carrier density is degenerate, the Lorenz number becomes (π²/3)(kb/q)², which is independent of the scattering mechanisms and also of the dimension. This is the Wiedemann–Franz law, similar to, but not quite the same as, the classical formulation discussed in Chapter 2. On the other hand, for the non-degenerate case the scattering mechanism and the dimensionality both determine the asymptotic value of (r + (d+2)/2)(kb/q)² approached for η ≪ −1. A large ZT is thus desirable for both Peltier cooling/heating as well as a thermoelectric battery. Using the Lorenz number L, and neglecting the lattice thermal conductivity, the ZT of any material is ZT = S²/L, which is dimensionless. The value of ZT of metals is low, because they have low S and high κe, which outweighs their high σ. Since the thermal conductivity in metals is nearly entirely due to electrons, ZT ∼ S²/L ∼ (π²/3)(kb T/µ)² ≪ 1, making them unsuitable for thermoelectric energy conversion, or cooling/heating. Semiconductors and semimetals on the other hand boast higher ZT values, some close to unity. The search for materials with ZT ∼ 2−3 continues, and will have a revolutionary impact if successful. The need for low κ and high σ makes heavy-atom semiconductors/semimetals from the lower part of the periodic table the most attractive (Fig. 21.25). The currently available ZT values restrict thermoelectric energy conversion efficiencies to less than 10%. Because this is lower than heat engines based on moving gases or fluids, thermoelectrics have not made a mainstream impact yet.
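The degenerate-limit Lorenz number and the resulting metal ZT estimate are easy to evaluate. The sketch below assumes a Fermi energy µ ≈ 5 eV, typical of a good metal (the value is an assumption), and shows why metals make poor thermoelectrics:

```python
import math

kb = 1.380649e-23   # Boltzmann constant, J/K
q = 1.602177e-19    # elementary charge, C

# Degenerate (Wiedemann-Franz) Lorenz number, L = (pi^2/3)(kb/q)^2
L0 = (math.pi**2 / 3) * (kb / q)**2   # ~2.44e-8 W Ohm / K^2

# Metal ZT ~ (pi^2/3)(kb T / mu)^2 with an assumed mu = 5 eV at T = 300 K
mu = 5.0 * q
T = 300.0
ZT_metal = (math.pi**2 / 3) * (kb * T / mu)**2
```

Since kb·T at room temperature is only ~26 meV, the ratio kb T/µ is tiny and ZT of a metal comes out orders of magnitude below unity.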
Nevertheless, semiconductor and semimetal-based thermoelectrics have already found several uses that effectively exploit their special traits. Because the transport is by electrons and holes (instead of molecules), they are compact and fast, and because they have no ”moving parts”, they have an almost unlimited lifetime. Some of the popular thermoelectric materials are doped Bi2Te3/Sb2Te3 and Si/SiGe superlattices. They are used for precision temperature control with fast feedback for semiconductor lasers and other devices that are sensitive to temperature. They have enabled compact electric beverage heaters and coolers, car-seat and windscreen heaters, and have several automotive and space applications⁶⁰.


21.10 Onsager relations

In the preceding sections we have seen that a generalized current in the direction i in response to a generalized force Fj in the direction j is a sum Ji = ∑j Lij Fj, which written in matrix form is

[ J1 ]   [ L11 L12 L13 … ] [ F1 ]
[ J2 ] = [ L21 L22 L23 … ] [ F2 ]   (21.77)
[ J3 ]   [ L31 L32 L33 … ] [ F3 ]
[ ⋮  ]   [  ⋮   ⋮   ⋮  ⋱ ] [ ⋮  ]

In Equation 21.51 we have seen that the corresponding rate of change of entropy is dS/dt = ∑i Ji Fi = ∑ij Fi Lij Fj. Since the entropy should increase with time in the approach towards equilibrium, dS/dt > 0, the square matrix of the kinetic coefficients Lij should be positive definite, which means all its eigenvalues are positive. Onsager (Fig. 21.26) showed that when there is no magnetic field (B = 0), in addition to being positive definite, the matrix of transport coefficients is also symmetric, that is Lij = Lji. In the presence of a magnetic field, time reversal symmetry implies that the relation Lij(+B) = Lji(−B) must hold. This is the Onsager relation, which he derived from statistical mechanics by invoking the reversibility of the microscopic laws of motion governing the particles of the system. We encountered the first Onsager relation in thermoelectric phenomena. The Seebeck coefficient S and the Peltier coefficient Π were found to be connected by the relation Π = TS via the kinetic coefficients⁶¹ derived in Equations 21.53, 21.54, and 21.55, and then applied in Equation 21.71. This was already realized in Equations 21.53, where the kinetic coefficients of the last three charge current components related to the forces due to temperature gradients and magnetic fields were found to be exactly the same as the first three components of the heat current in Equation 21.54 due to electric and magnetic fields and not due to thermal forces⁶². Long before Onsager’s derivation, the thermoelectric relation between the Seebeck and Peltier coefficients was known as the Kelvin relation after Lord Kelvin (William Thomson), who had derived and explained it from thermodynamic considerations. The above matrix for charge current J and heat current Jq in three dimensions in response to the three forces E, B, and ∇r T thus forms a 6 × 6 Lij matrix, characterized by 36 kinetic coefficients of the type in Equation 21.55.
Because of Onsager’s relations, the 15 off-diagonal matrix elements are symmetric, implying 21 independent kinetic coefficients. Because all currents are interrelated through this matrix, there are several more Onsager relations in the presence of magnetic fields, which go by the names of researchers that discovered them experimentally before Onsager explained why they were the same. The several Onsager relations in the presence of a magnetic field are discussed for thermo-magnetic transport phenomena in Chapter 25.
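The two structural requirements described here — positive definiteness (so that dS/dt > 0) and Onsager symmetry at B = 0 — can be illustrated with a toy numerical check. The matrix below is random and purely illustrative, not a set of physical kinetic coefficients:

```python
import random

random.seed(1)
n = 4  # a small set of generalized currents/forces

# Build L = M^T M + 0.1*I: symmetric and positive definite by construction.
M = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
L = [[sum(M[k][i] * M[k][j] for k in range(n)) + (0.1 if i == j else 0.0)
      for j in range(n)] for i in range(n)]

# Onsager symmetry at B = 0: L_ij = L_ji
symmetric = all(abs(L[i][j] - L[j][i]) < 1e-12
                for i in range(n) for j in range(n))

# Entropy production dS/dt = sum_ij F_i L_ij F_j > 0 for any nonzero force F
def entropy_production(F):
    return sum(F[i] * L[i][j] * F[j] for i in range(n) for j in range(n))

positive = all(entropy_production([random.uniform(-1, 1) for _ in range(n)]) > 0
               for _ in range(1000))
```

By construction F·L·F = |MF|² + 0.1|F|², so the entropy production is strictly positive for every nonzero force vector, mirroring the second-law requirement on the kinetic matrix.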

Fig. 21.26 Lars Onsager formulated the reciprocal relations for kinetic coefficients from fundamental thermodynamic arguments. He also provided the exact solution of the 2D Ising model, and realized that quantum oscillations of the magnetic susceptibility in the de Haas–van Alphen effect could be used to measure the Fermi surfaces of metals. Onsager was awarded the 1968 Nobel Prize in Chemistry.

61 If instead of defining the driving force

as ∇r T, we had used ∇r (1/T), then the identity L12 = L21 would be obtained.

62 Thus, one can loosely say that the Onsager relations imply that the ratio of an electrical particle current to the temperature gradient that drives it is identical to the ratio of the corresponding heat current to the electrical gradient that drives it. It connects phenomena that naively may seem distinct, but that at the microscopic level are connected because of the reversibility of the dynamical laws that govern their motion.


21.11 Conservation laws

Table 21.1 Moments of the BTE and the corresponding conservation laws.

  φk            Conservation law
  Particle# 1   ∂n/∂t + ∇r · j = 0
  Energy Ek     ∂ue/∂t + ∇r · Jue = F · J − R
  Entropy sk    ∂se/∂t + ∇r · Jse = Gse

We have investigated the steady-state solution of the BTE in response to DC electric or magnetic fields, or temperature gradients. This is insufficient for forces that oscillate in time. A solution method for oscillating forces for the BTE will now be developed. In addition to its use in later chapters on magnetotransport and photonic properties of semiconductors, it will illuminate and strengthen features of the BTE by connecting to conservation laws that were encountered earlier. The general principle is called taking ”moments” of the BTE. Using the RTA solution for the collision term ∂fk/∂t|coll. ≈ −(fk − fk0)/τ0, multiplying the BTE by φk, a function of k, and summing over all k gives

(1/Ω) ∑k φk [ ∂fk/∂t + vk · ∇r fk + (F/ħ) · ∇k fk ] = −(1/Ω) ∑k φk (fk − fk0)/τ0
⟹ ∂ρφ(r, t)/∂t + (1/Ω) ∑k φk [ vk · ∇r fk + (F/ħ) · ∇k fk ] = −(1/Ω) ∑k φk (fk − fk0)/τ0,   (21.78)

where the density ρφ(r, t) = (1/Ω) ∑k φk fk(r, t) per volume Ω is the spatial variation of the physical property φk at time t. For simplicity we assume a k-independent τ0 (more general solutions can be sought). Choosing φk successively as different physical quantities generates various conservation laws, which are now described (see Table 21.1).

63 Here we use the physical argument that ∑k ∇k · ((dk/dt) fk) = ∑k (dk/dt) · (∇k fk) + ∑k fk (∇k · (dk/dt)) = ∑k (dk/dt) · (∇k fk) = 0, since there is no net outflow or ”divergence” of particles in k-space from the Brillouin zone in intraband transport.

64 Since there is no net outflow of energy

in the k-space from the Brillouin zone, we have ∑k ∇k · (Ek (dk/dt) fk) = 0, which gives ∑k Ek (dk/dt) · (∇k fk) = E · J, where E is the electric field. This relation is used to derive the law of energy conservation for the electron system.

Conservation of particles: For φk = 1, ρφ(r, t) = (1/Ω) ∑k fk(r, t) = n(r, t) is the particle density. Because the number of particles is unchanged, ∑k fk = ∑k fk0, the RHS of Equation 21.78 vanishes. The equation then becomes the particle conservation law ∂n/∂t + ∇r · j = 0, where j = (1/Ω) ∑k vk fk is the particle current density, and the equation is that of current continuity⁶³. The electron charge current is J = −qj.
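The particle conservation law can be demonstrated with a one-dimensional finite-difference sketch: take a diffusive particle current j = −D ∂n/∂x (an illustrative choice of constitutive relation) and check that ∂n/∂t = −∂j/∂x conserves the total particle number on a periodic grid:

```python
# 1D continuity equation dn/dt + dj/dx = 0 with j = -D dn/dx,
# integrated with explicit finite differences on a periodic grid.
N, dx, dt, D = 100, 1.0, 0.1, 1.0   # grid and diffusion constant (illustrative)
n = [1.0 + (0.5 if 40 <= i < 60 else 0.0) for i in range(N)]  # initial bump
total0 = sum(n) * dx

for _ in range(2000):
    # current on the links between sites (periodic boundary)
    j = [-D * (n[(i + 1) % N] - n[i]) / dx for i in range(N)]
    # dn/dt = -dj/dx: divergence of the link currents
    n = [n[i] - dt * (j[i] - j[i - 1]) / dx for i in range(N)]

total = sum(n) * dx   # total particle number after the evolution
```

The divergence terms telescope when summed over the ring, so the total particle number is conserved to floating-point accuracy even as the initial bump spreads out.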

Conservation of electron energy: Choosing φk = Ek as the electron energy eigenvalue from the known bandstructure, in the scattering term on the RHS, R = (1/Ω) ∑k Ek (fk − fk0)/τ0 is the net rate of energy loss by the electron population to the lattice due to scattering. Denoting the electron energy density by ue = (1/Ω) ∑k fk Ek and the energy current density by Jue = (1/Ω) ∑k Ek vk fk, Equation 21.78 becomes⁶⁴

∂ue/∂t + ∇r · Jue = E · J − R,   and   ∂ul/∂t + ∇r · Jul = R,   (21.79)


which are the energy conservation relations for the electron system and the lattice system. E · J is the generation term, since it is the energy absorbed by the electron system from the electric field, and R is the energy recombination or loss term for electrons: it is delivered to


the lattice by collisions, producing another conservation relation for the lattice energy density ul. Conservation of entropy and total energy: The relation for entropy generation was already discussed in Equation 21.50, where several of the current derivations from the moments of the BTE were also encountered from a thermodynamic viewpoint. We can now get the bigger picture from the BTE as to how the total energy in the transport process is conserved. The two equations in 21.79 account for the energy stored in the electron system ue and the lattice ul. To complete the story we need to account also for the energy density uEM in the electromagnetic field that drives the current. The conservation law is directly obtained from Maxwell’s equations by writing

∂/∂t [ ½ (D · E + H · B) ] + ∇r · [E × H] = −E · J,   (21.80)



where D is the displacement vector and H the magnetic field, which define the standard energy per unit volume uEM in the electromagnetic field. The energy current density of the EM field, JuEM = E × H, is the Poynting vector. The RHS is the energy lost by the EM field to the electrons in the lattice. Adding the LHS and RHS of Equations 21.79 and 21.80 gives the total conservation relation

∂/∂t ( ue + ul + uEM ) + ∇r · ( Jue + Jul + JuEM ) = 0,   (21.81)




which explicitly shows the division of energy into the electron, lattice, and EM systems, and ensures that in the transport process the total energy is always conserved. Conservation of momentum: Let us now derive a rather important and widely applicable result using the moment of φk = vk, the group velocity of state k in the electron bandstructure. Multiplying the BTE for fk with the corresponding vk with the same k, dividing by the volume, using (∂fk/∂t)coll. = −(fk − fk0)/τ0, and summing over all k, we get

(1/Ω) ∑k vk [ ∂fk/∂t + vk · ∇r fk + (F/ħ) · ∇k fk ] = −(1/Ω) ∑k vk (fk − fk0)/τ0
⟹ ∂J/∂t + J/τ0 = (q/ħΩ) ∑k vk (F · ∇k fk) + (q/Ω) ∑k vk (vk · ∇r fk),   (21.82)

where J = −(q/Ω) ∑k fk vk is the electron current density and F = −q(E + vk × B) is the Lorentz force. Steps described in Exercise 21.15 transform this moment equation of the BTE into a rather important form for


transport in response to forces oscillating in time:

∂J/∂t + J/τ0 = (nq²/m*) E − (q/m*) (J × B) + (qD/τ0) ∇r n,   (21.83)


where the macroscopic charge current density J is related to the applied electric field E, the magnetic field B, and any existing gradients in the carrier density ∇r n. This result is valid for a parabolic bandstructure of effective mass m* and density n, and D is the diffusion constant. This generalized dynamical form of the drift-diffusion equation helps explain a host of electronic, optical, and magnetic phenomena in semiconductors, and is especially useful for sinusoidal forces. We briefly discuss this time-dependent formulation of the BTE. Several features of transport phenomena in semiconductors may be immediately recognized from Equation 21.83. Consider the steady state reached with a DC electric field E, at no magnetic field and no concentration gradients. Then the drift current is J = (nq²τ0/m*)E = σE, where σ = qnµ is the conductivity and µ = qτ0/m* is the mobility. If there are no electric or magnetic fields, J = qD ∇r n is the diffusion current. In the presence of a DC magnetic field B with zero electric field and no carrier density gradients, an oscillating current may be identified by writing (iω + 1/τ0) J = −(qB/m*) J, identifying the cyclotron

frequency ωc = qB/m*. If B = 0 and the electric field oscillates in time with the AC form E = E0 e^{iωt}, such as due to an electromagnetic wave, then (iω + 1/τ0) J0 e^{iωt} = (nq²/m*) E0 e^{iωt} gives the AC current

65 We continue this discussion and use

the dynamic solution of the BTE in Chapters 25 and beyond.

66 The boldface Ω is the k-dependent k

Berry curvature vector, not to be confused with the scalar symbol Ω sometimes used for a volume. 67 Such situations occur for graphene, topological insulators, magnetic semiconductors, and also when spin–orbit interaction is considered in traditional semiconductors. The phase–space volume evolves in time as (1/∆V)(d∆V/dt) = −(d/dt) ln[1 + (q/ħ)B · Ω] for non-zero Berry curvature materials when there is a non-zero magnetic field.

J(ω) = [(nq²τ0/m*)/(1 + iωτ0)] E0. This is identical to the Drude form that was used back in Exercise 2.5 to explain the reflectivity of metals⁶⁵.
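This Drude response is simple to evaluate with complex arithmetic. The sketch below uses an assumed GaAs-like effective mass m* = 0.067 m0, τ0 = 0.1 ps, and carrier density 10²³ m⁻³ (illustrative numbers, not from the text), and shows the DC limit, the 1/√2 roll-off at ω = 1/τ0, and the cyclotron frequency:

```python
import math

q = 1.602177e-19     # elementary charge, C
m0 = 9.109384e-31    # free electron mass, kg
m_star = 0.067 * m0  # GaAs-like effective mass (assumed)
tau0 = 1e-13         # scattering time, s (assumed)
n = 1e23             # carrier density, m^-3 (assumed)

sigma0 = n * q**2 * tau0 / m_star   # DC Drude conductivity, S/m

def sigma(omega):
    """AC Drude conductivity sigma(omega) = sigma0 / (1 + i omega tau0)."""
    return sigma0 / (1 + 1j * omega * tau0)

roll_off = abs(sigma(1 / tau0)) / sigma0   # = 1/sqrt(2) at omega = 1/tau0
omega_c = q * 1.0 / m_star                 # cyclotron frequency at B = 1 T, rad/s
```

At ω ≪ 1/τ0 the current follows the field in phase; at ω ≫ 1/τ0 the magnitude of σ(ω) falls off as 1/ω, the behavior exploited in Exercise 2.5 for the reflectivity of metals.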

21.12 Berry curvature correction

Before discussing, in the next section, the limitations of the applicability of the BTE as formulated in this chapter, we point out a fundamental change that needs to be introduced in crystals that have non-zero Berry curvatures⁶⁶ Ωk in their bandstructure. From Equation 21.1 the velocity of an electron wavepacket is v(k) = (1/ħ)∇k En(k) − (F/ħ) × Ωk. It turns out that the phase–space volume ∆V = ∆r∆k in transport is not conserved for non-zero Ωk, violating the Liouville theorem⁶⁷. The phase space volume changes to ∆V = ∆V(0)/(1 + (q/ħ)B · Ω). It may seem that the whole framework of the BTE discussed in this chapter may not apply for such materials. This matter is resolved by simply re-defining the density of states with the following prescription:

∑k (...) → ∑k (1 + (q/ħ) B · Ωk) × (...),



which rescales the phase space so that the results of the BTE remain valid as long as the extra factor that includes the Berry curvature and



magnetic field are included in all summations to evaluate transport coefficients.
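For a feel of the size of this correction, the sketch below evaluates the factor (1 + (q/ħ)B·Ωk) for an assumed Berry curvature magnitude |Ω| ~ (1 nm)², a representative scale for bands with strong spin–orbit coupling (the value is an assumption chosen only to illustrate magnitudes):

```python
q = 1.602177e-19      # elementary charge, C
hbar = 1.054572e-34   # reduced Planck constant, J s

Omega = (1e-9)**2     # Berry curvature magnitude, m^2 (assumed scale)
B = 1.0               # magnetic field, T

correction = (q / hbar) * B * Omega   # dimensionless (q/hbar) B.Omega
factor = 1 + correction               # phase-space rescaling factor
```

For these numbers the correction is at the 10⁻³ level at 1 T, a small but measurable rescaling of all transport summations in Berry-curved bands.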

21.13 Limitations of the BTE

The BTE for electrons in a band is derived from the Liouville equation, which requires that the electron number be conserved in the band. Thus, physical processes that move electrons into, or out of, bands are not captured in the standard BTE. These situations, which are encountered when electron populations in bulk semiconductors are subjected to strong excitations that pull their distribution in phase space very far from equilibrium, are difficult to describe by the BTE without significant extensions. These extensions require interband generation-recombination currents, and coupled BTEs for each band involved in the process. Table 21.2 shows the DC electric and magnetic fields, and AC electric fields (optical excitation), that can cause interband transitions in semiconductors of bandgap Eg. The strong perturbation conditions are reached easily in narrow bandgap semiconductors. In materials that have band overlaps, such as semimetals, or those that have band degeneracies, such as Dirac cones in graphene, special care must be taken and the BTE must be extended accordingly to account for the special situation that low-field inelastic scattering (or, for semimetals, even elastic scattering!) can move electrons between different bands. Semiconductor devices for which the BTE approach either fails (or is unnecessary) to explain the simplest experimental phenomena serve to illustrate its limitations. The first example where the BTE is unnecessary is (surprisingly) the earliest semiconductor device: the Schottky diode. As was described in Chapter 18, Section 18.1, the Schottky diode is essentially a ballistic device. Therefore, for a Schottky diode, a transport model that neglects scattering captures the device behavior better than those that account for scattering.
As opposed to Bethe’s ballistic model described in Section 18.1, models developed by Mott and by Schottky accounted for scattering and used a drift-diffusion approach for electrons moving from the semiconductor to the metal. These models also reproduce the J = J0 (e^{qV/kb T} − 1) behavior, and are useful for high-voltage Schottky diodes with depletion region thicknesses much larger than the mean free path of electrons, xd ≫ λmfp. For Schottky diodes that have xd ≤ λmfp, the splitting of the quasi-Fermi levels at the semiconductor surface is EFs − EFm = qV, which is much closer to the ballistic model than to those that need scattering and a BTE treatment. As opposed to the Schottky diode, where the BTE can be used (but is unnecessary), there are semiconductor devices where it fails to explain the basic operation. An example of such a device is the resonant tunneling diode (RTD) discussed in Chapter 24. The reason the BTE is inapplicable in that situation is because, from the get-go, the BTE solves for the probability density f(x, k, t) of the electrons in the phase space.

Table 21.2 Physical conditions outside the regime of standard Boltzmann Transport Equation solutions. Here Eg is the bandgap, EF the Fermi level, and a the lattice constant. These cases are discussed in later chapters on tunneling, magnetotransport, and optical properties.

  Condition
  Zener tunneling: DC field                   qFa > Eg²/EF
  Magnetic tunneling: DC magnetic field       ħ(qB/m*) > Eg²/EF
  Optical excitation: AC field (E0 e^{iωt})   ħω > Eg


68 Quantum mechanical extensions of

the BTE which solve for the wavefunction amplitude, such as the Wigner formalism, can explain the behavior of the RTD. The non-equilibrium Green’s function (NEGF) approach based on the Landauer picture of transport starts from a fully ballistic model, to which adding scattering and dephasing becomes increasingly difficult numerically. Capturing all details of the RTD physics at room temperature remains a stringent testing ground for the validity and power of theories and models for quantum transport.

69 The most vivid experimental manifes-

tation of this is the Franz–Keldysh effect which is a photon-assisted interband tunneling process, discussed in Chapter 27, Section 27.11.

70 This introduces quasi-periodic oscilla-

tions in the density of states. The periodicity in DOS appears because of the requirement that the orbital periodicity of electron wavefunctions also satisfy an additional periodicity introduced by cyclotron orbits due to the magnetic field.

The probability density, analogous to the intensity of an electromagnetic wave, is proportional to the absolute square of the wavefunction amplitude, f ∝ |ψ|². Because the phase differences between the electron wavefunctions are lost in the BTE, it cannot capture the physics that is born out of interference effects caused by phase differences. In an RTD, a negative differential resistance (NDR) is observed when there are two potential barriers in the path of an electron. Electron wave interference allows resonant states to transmit through the barriers with essentially unity transmission probability. Because this transmission is due to constructive interference of the phase of the electron wavefunction, the BTE is unable to capture its physics⁶⁸. Transport processes in which band electrons are captured by deep level defects and then re-emitted are not modeled well by the BTE. A recombination/generation model is better suited for such purposes. A model due to Mott, called variable range hopping, is more popularly used for situations where several electrons are trapped in deep defects and hop from one defect state to another. Exercise 21.14 shows how this model accounts for transport phenomena in the presence of severe disorder, which occurs in some semiconductors of technological relevance. If the external driving forces that affect transport also strongly modify the underlying density of states of the semiconductor, then the BTE is no longer applicable. The first such case occurs for high electric fields that result in interband tunneling of electrons in semiconductors. The finite spatial overlap of the valence band and conduction band wavefunctions then leads to an oscillatory joint density of states⁶⁹. The interband tunneling current therefore falls outside the scope of the ordinary BTE, because the number of carriers in a band is not conserved, violating the requirement of conserved particle number in the Liouville theorem.
Possible extensions of the BTE for coupled multiband transport can partially address this problem. Chapter 24 is devoted to the discussion of tunneling transport in semiconductors. The second case where the BTE fails is for magnetotransport phenomena at very high magnetic fields, which is the subject of Chapter 25. A magnetic field changes the density of states of a solid by the formation of characteristic bunching into Landau levels⁷⁰. The separation of the Landau levels is ħωc, where ωc = qB/m* is the cyclotron frequency. At small magnetic fields, when ħωc ≪ kb T or ħωc ≪ ħ/τ, the Landau quantization is smeared out and the standard BTE remains adequate. But at high magnetic fields and low temperatures, when ħωc ≫ kb T and ħωc ≫ ħ/τ, quantum oscillations such as the Shubnikov–de Haas oscillations and the quantum Hall effect are readily observed in semiconductors. For such transport phenomena, the BTE is inadequate, and a density matrix approach based on the Kubo formalism is necessary.
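To see why high fields and low temperatures are needed for quantum oscillations, the sketch below computes the field at which ħωc equals kb T, assuming a GaAs-like effective mass m* = 0.067 m0 (the material choice is illustrative):

```python
kb = 1.380649e-23     # Boltzmann constant, J/K
hbar = 1.054572e-34   # reduced Planck constant, J s
q = 1.602177e-19      # elementary charge, C
m0 = 9.109384e-31     # free electron mass, kg
m_star = 0.067 * m0   # GaAs-like effective mass (assumed)

def B_for_hbar_wc_equals_kbT(T):
    """Field where hbar * (q B / m*) = kb * T."""
    return kb * T * m_star / (hbar * q)

B300 = B_for_hbar_wc_equals_kbT(300.0)  # ~15 T needed at room temperature
B4 = B_for_hbar_wc_equals_kbT(4.0)      # well below 1 T at liquid-helium T
```

At room temperature the crossover field is of order 15 T for this effective mass, which is why Shubnikov–de Haas oscillations and the quantum Hall effect are typically measured at cryogenic temperatures, where fields below 1 T already satisfy ħωc > kb T.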



We discuss this fascinating topic in Chapter 25. The BTE is therefore most powerful in near-equilibrium transport phenomena that do not contain effects of electron phase interference. It is the preferred method to explain and compute low-field transport coefficients such as mobilities and diffusion coefficients. Most importantly, its power lies in accounting for scattering, and for small perturbations of electric field, magnetic field, heat, or light with equal ease. Furthermore, the BTE is the best method to show that deep down, these perturbations are interrelated, via the Onsager relations.

21.14 Chapter summary section This is one of the longest chapters of the book! We went deeper into irreversible thermodynamics in this chapter, and learned:

• How the Boltzmann transport equation (BTE) is obtained in the single-particle approximation of the Liouville equation.
• That by letting go of the knowledge of every electron and treating the transport problem statistically, the BTE makes it possible to discuss the effect of scattering of a large number of electrons.
• The BTE shows that the equilibrium distribution of electrons is the one that has the highest entropy.
• The collision term of the BTE accounts for all microscopic scattering events via Fermi’s golden rule.
• The relaxation-time approximation (RTA) is a linear solution to the BTE that allows for analytical solutions.
• Charge currents J and heat currents Jq flow in response to electric fields E, magnetic fields B, and temperature gradients ∇r T.
• The kinetic coefficients that relate the currents to the forces are not independent: they are related by the Onsager relations.

Further reading Richard Tolman’s masterpiece The Principles of Statistical Mechanics provides a rigorous treatment of the Boltzmann H-theorem and irreversibility in classical and quantum mechanics. The Boltzmann Equation and Its Applications by Carlo Cercignani gives a comprehensive discussion of the Boltzmann equation and its many uses. In Transport Phenomena, Smith and Jensen shine the spotlight on the power and versatility of the Boltzmann transport equation by showing its use in understanding trans-

port phenomena of gases, metals, semiconductors, insulators, and superconductors in a unified manner. Various transport coefficients and the role of entropy is discussed in illuminating ways in Electronic Conduction in Solids by Smith, Janak, and Adler. Wolfe, Holonyak, and Stillman’s Physical Properties of Semiconductors, and Lundstrom’s Fundamentals of Carrier Transport provide in-depth discussion of the Boltzmann transport equation specifically for semiconductors.


Exercises

(21.1) Why Bloch oscillations are difficult
(a) Show that the time required for a Bloch oscillation is tB = h/(qFa) = 100 ps/[(a in 0.4 nm)·(F in kV/cm)].
(b) To make the time comparable to the sub-ps scattering processes, one can increase the electric field. Estimate the field at which the time for Bloch oscillations is reduced to 0.1 ps.
(c) What problems would one run into at these high fields that would prevent Bloch oscillations for the single electron?

(21.2) The Boltzmann trick
To prove his rather famous H-theorem, Boltzmann used an ingenious physical argument to show that the expression for the rate of change of entropy of a spontaneous process must be mathematically always positive, implying that entropy increases, which is his proof of the second law of thermodynamics. In the step going from Equation 21.19 to Equation 21.20, show that exchanging k → k1 and the other indices successively should leave all of the physics of the electron-electron scattering unaltered, since they are dummy variables. Thereby, prove how Equation 21.20 is obtained from Equation 21.19.

(21.3) The arrow of time

Fig. 21.27 A box with partition illustrates the relation of time’s arrow and Boltzmann’s H-theorem with fluctuations and statistical nature rooted in large numbers.

It is sometimes said that the second law of thermodynamics and the direction of change of entropy define an "arrow" of time. Fig. 21.27 shows three snapshots in time at $t = t_1$, $t_2$, and $t_3$ of a partitioned box in which there are gas molecules (or it could be an electron gas system in a solid with a split-gate partition).
(a) For the upper row for large N, what is the time-ordering of $t_1$, $t_2$, $t_3$? If your answer is $t_1 < t_2 < t_3$, then you have implicitly accepted the second law of thermodynamics. The states of the system have defined the direction of time, or time's arrow.
(b) What is the ordering of time for the lower row with small N?
(c) Discuss and comment on the statistical nature of the second law and time's arrow. At what N is there a crossover?
(d) Note that for the first row, in addition to large N, the state on the left is highly improbable, meaning it is a far-from-equilibrium, low entropy state. Since the system cannot spontaneously end up there, someone must have put in the work to create this state. Argue then that time's arrow and Boltzmann's H-theorem also require a low entropy initial state in addition to the large N.
(e) Argue why, as semiconductor devices move increasingly into the nanoscale, a reconsideration of several cherished laws of thermodynamics (that require large N) becomes inevitable.

(21.4) The Jarzynski and Crooks equalities
For spontaneous processes, the second law of thermodynamics relates the entropy S to work W and free energy F by an inequality, since the Boltzmann H-theorem proves that entropy must increase. Jarzynski and Crooks have recently discovered remarkable equalities for nonequilibrium thermodynamics that assume special importance for fluctuations that are observed in small systems.
(a) The Jarzynski equality between the work W and free energy change $\Delta F$ in a cyclic process is
$$\langle e^{-\frac{W}{k_b T}} \rangle = e^{-\frac{\Delta F}{k_b T}},$$
where the brackets indicate an average performed over several cycles of an identical process, and thus include the fluctuations in each cycle. Read about the equality, and note that its derivation uses the Liouville theorem. Comment on its importance in small systems.
(b) The Crooks equality is
$$\frac{\rho_F(W)}{\rho_R(W)} = e^{\frac{W - \Delta F}{k_b T}},$$
where $\rho_F(W)$ indicates the probability density of W, the work done in the forward path, and $\rho_R(W)$ is the same for the reverse path. The spreads in $\rho(W)$ indicate the fluctuations in the forward and reverse processes. Read and appreciate this relation, and discuss its importance in nonequilibrium processes occurring at the nanoscale.

(21.5) Putting the genie back in the bottle
Newton's laws of classical mechanics and Schrödinger's equation of quantum mechanics do not distinguish between processes running forward or backward in time. In other words, these laws cannot describe why the molecules of perfume that escape from a vial into a room can never get back into the vial. This is explained by thermodynamics, which is not contained in Newton's and Schrödinger's equations. Describe how thermodynamics is introduced into classical mechanics (resulting in Maxwell–Boltzmann distributions) and into quantum mechanics (resulting in Fermi–Dirac and Bose–Einstein distributions), and how this maintains the irreversibility of certain processes.

(21.6) Legacy of the BTE: 150 years of heated debates, beautiful physics and mathematics
Famously, Poincaré proved a recurrence theorem which shows that the gas escaping from a bottle actually will return into the bottle, but after an astronomically long time, much longer than the age of the universe! The H-theorem has a rich legacy: it introduced a quantitative formulation of entropy. This in turn led Planck to realize that since the entropy in the presence of a continuous energy spectrum of photons would become infinite, the only resolution was to assume discrete quanta of photon energies. This realization gave birth to quantum mechanics. The mathematics underlying the H-theorem treatment of the BTE has been a rich source of ideas for partial differential equations, leading to several fundamental breakthroughs.
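The Jarzynski equality of Exercise 21.4(a) can be illustrated with a toy Monte Carlo average (a sketch, not from the book): for Gaussian work fluctuations $W \sim \mathcal{N}(\langle W\rangle, \sigma^2)$ the equality holds exactly with $\Delta F = \langle W\rangle - \sigma^2/2k_bT$, so a sample average of $e^{-W/k_bT}$ should reproduce $e^{-\Delta F/k_bT}$:

```python
# Toy illustration of the Jarzynski equality <exp(-W/kT)> = exp(-dF/kT)
# for Gaussian work fluctuations (illustrative parameters, units of kT = 1).
import random, math

random.seed(0)
kT = 1.0
mean_W, sig = 2.0, 1.0
dF = mean_W - sig**2 / (2 * kT)   # free-energy change implied by Gaussian W

samples = [random.gauss(mean_W, sig) for _ in range(200_000)]
lhs = sum(math.exp(-w / kT) for w in samples) / len(samples)
rhs = math.exp(-dF / kT)
print(lhs, rhs)   # the two agree to within Monte Carlo noise
```

Note how the average is dominated by rare fluctuations with $W < \Delta F$, which is why the equality matters most in small systems where such fluctuations are observable.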
The latest and most famous is the proof of the century-old Poincaré conjecture by G. Perelman in 2006, who among other things used a concept (similar to Boltzmann's H-entropy) of the W-entropy of solitons flowing along Ricci fields. This is ironic given the role of Poincaré's recurrence theorem against Boltzmann's H-theorem! Study this chain of events and write a short essay based on the connections created by the BTE.

(21.7) The Maxwell–Boltzmann distribution from the BTE
As an example of the original result derived by Boltzmann from the BTE, using the entropy function for classical particles in Equation 4.5, derive the Maxwell–Boltzmann distribution for particles with mass m and velocities v using the same strategy that was used to derive the Fermi–Dirac distribution in Section 21.5. The result is
$$f(v) = \left(\frac{m}{2\pi k_b T}\right)^{\frac{3}{2}} \cdot 4\pi v^2 \cdot e^{-\frac{mv^2}{2 k_b T}},$$
which is proportional to $v^2 \cdot e^{-Av^2}$. Sketch the distribution in velocities at several temperatures.
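As a quick numeric sanity check of the distribution in Exercise 21.7 (a sketch with an illustrative particle mass), $f(v)$ should integrate to unity over all speeds:

```python
# Check that the Maxwell-Boltzmann speed distribution
# f(v) = (m/(2*pi*kb*T))**1.5 * 4*pi*v**2 * exp(-m*v**2/(2*kb*T))
# is normalized: its integral over v >= 0 should be 1.
import math

kb = 1.380649e-23   # J/K
m  = 6.6e-27        # ~helium atom mass, kg (illustrative)
T  = 300.0          # K

def f(v):
    a = m / (2 * math.pi * kb * T)
    return a**1.5 * 4 * math.pi * v**2 * math.exp(-m * v**2 / (2 * kb * T))

# crude trapezoidal integration over a wide-enough speed range
vmax, n = 10000.0, 200_000
dv = vmax / n
total = sum(f(i * dv) for i in range(1, n)) * dv + 0.5 * (f(0.0) + f(vmax)) * dv
print(total)   # ≈ 1.0
```

Evaluating `f` on a grid at several temperatures produces the sketches the exercise asks for: the peak moves to higher v and broadens as T increases.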

(21.8) Bose–Einstein distribution from the BTE
As another example of a rather remarkable result derivable from the BTE, using the entropy function for bosons in Equation 4.4, derive the Bose–Einstein distribution using the same strategy that was used to derive the Fermi–Dirac distribution in Section 21.5.

(21.9) The electro-magneto-thermal vector G
The generalized solution $f_k = f_k^0 - \frac{\partial f_k^0}{\partial E_k}\phi_k$ of the BTE in the simultaneous presence of electric and magnetic field and thermal gradient is given by $\phi_k = -\tau_k(\mathbf{v}_k \cdot \mathbf{G})$, where G is the electro-thermo-magnetic vector
$$\mathbf{G} = \frac{\mathbf{F} + q\tau\,(\mathbf{B}\times[M]^{-1}\mathbf{F}) + \frac{(q\tau)^2}{\det[M]}(\mathbf{F}\cdot\mathbf{B})[M]\mathbf{B}}{1 + \frac{(q\tau)^2}{\det[M]}(\mathbf{B}\cdot[M]\mathbf{B})},$$
where $\mathbf{F} = \nabla_r(\mu - q\varphi) + \frac{E-\mu}{T}\nabla_r T$ is the electro-thermal force, B is the magnetic field, and [M] is the effective mass tensor. In this problem we derive this from the BTE for scalar effective masses $m^\star$, and then investigate some properties of tensor effective masses.
(a) The collision term in the RHS of the BTE in the RTA is $\frac{\partial f_k}{\partial t}\big|_{coll.} = -\frac{f_k - f_k^0}{\tau_k}$. The steady state drift terms in the LHS of the BTE with the external force $\mathbf{F}_{ext} = -q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ are $\frac{\mathbf{F}_{ext}}{\hbar}\cdot\nabla_k f_k + \mathbf{v}_k\cdot\nabla_r f_k$. Show, by using $f_k = f_k^0 - \frac{\partial f_k^0}{\partial E_k}\phi_k$, substituting the electro-thermal force F and the external force $\mathbf{F}_{ext}$, and for a small change in $f_k$, that the BTE becomes
$$-\left(\frac{\partial f_k^0}{\partial E_k}\right)\mathbf{v}_k\cdot\mathbf{F} + \left(\frac{\partial f_k^0}{\partial E_k}\right)\frac{q}{\hbar}(\mathbf{v}_k\times\mathbf{B})\cdot\nabla_k\phi_k = \left(\frac{\partial f_k^0}{\partial E_k}\right)\frac{\phi_k}{\tau_k}.$$
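Once the closed form for G is in hand (part (b) below states it), it can be verified numerically. The sketch that follows (arbitrary illustrative values, plain Python) checks that $\mathbf{G} = [\mathbf{F}+\mu(\mathbf{B}\times\mathbf{F})+\mu^2(\mathbf{F}\cdot\mathbf{B})\mathbf{B}]/(1+\mu^2|\mathbf{B}|^2)$ satisfies the implicit equation $\mathbf{G} = \mathbf{F}+\mu(\mathbf{B}\times\mathbf{G})$:

```python
# Numeric verification that the closed form
# G = [F + mu*(B x F) + mu^2*(F.B)*B] / (1 + mu^2*|B|^2)
# solves the implicit vector equation G = F + mu*(B x G).

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def G_closed(F, B, mu):
    BxF = cross(B, F)
    FdB = dot(F, B)
    den = 1.0 + mu**2 * dot(B, B)
    return tuple((F[i] + mu * BxF[i] + mu**2 * FdB * B[i]) / den for i in range(3))

F  = (1.0, -2.0, 0.5)     # electro-thermal force components (illustrative)
B  = (0.3, 0.7, -1.1)     # magnetic field components (illustrative)
mu = 0.8                  # mobility q*tau/m*

G   = G_closed(F, B, mu)
BxG = cross(B, G)
res = tuple(F[i] + mu * BxG[i] - G[i] for i in range(3))
print(res)   # each component vanishes to machine precision
```

The residual is identically zero in exact arithmetic, mirroring the dot-and-cross-product manipulation used in the analytic solution.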

(b) Cancelling $\left(\frac{\partial f_k^0}{\partial E_k}\right)$ and rearranging, the vector differential equation $\phi_k = -\tau_k(\mathbf{v}_k\cdot\mathbf{F}) + \frac{q\tau_k}{\hbar}(\mathbf{v}_k\times\mathbf{B})\cdot(\nabla_k\phi_k)$ needs to be solved for $\phi_k$. Because the solution for B = 0 is $\phi_k = -\tau_k(\mathbf{v}_k\cdot\mathbf{F})$, try a solution of the form $\phi_k = -\tau_k(\mathbf{v}_k\cdot\mathbf{G})$, where G should be a vector composed of both B and F. Show that for a scalar effective mass, the form in Equation 21.52 is indeed obtained:
$$\mathbf{G} = \frac{\mathbf{F} + \mu(\mathbf{B}\times\mathbf{F}) + \mu^2(\mathbf{F}\cdot\mathbf{B})\mathbf{B}}{1+\mu^2|\mathbf{B}|^2}.$$
Solution: For a scalar effective mass $\nabla_k = \frac{\hbar}{m^\star}\nabla_v$ gives $\nabla_k(\mathbf{v}_k\cdot\mathbf{G}) = \frac{\hbar}{m^\star}\nabla_v(\mathbf{v}\cdot\mathbf{G})$, where the subscript k is suppressed. In the resulting identity $\nabla_v(\mathbf{v}\cdot\mathbf{G}) = (\mathbf{G}\cdot\nabla_v)\mathbf{v} + (\mathbf{v}\cdot\nabla_v)\mathbf{G} + \mathbf{v}\times(\nabla_v\times\mathbf{G}) + \mathbf{G}\times(\nabla_v\times\mathbf{v})$, the last three terms are zero since $\nabla_v\times\mathbf{v} = 0$ and G does not depend physically on v. Only the first term survives, which upon writing out the components becomes simply $(\mathbf{G}\cdot\nabla_v)\mathbf{v} = [G_1, G_2, G_3][\partial_{v_1}, \partial_{v_2}, \partial_{v_3}]^T[v_1, v_2, v_3] = [G_1, G_2, G_3] = \mathbf{G}$. Substituting $\phi_k = -\tau_k(\mathbf{v}_k\cdot\mathbf{G})$, the vector differential equation is reduced to the vector algebraic equation $\mathbf{v}_k\cdot\mathbf{G} = \mathbf{v}_k\cdot\mathbf{F} + \frac{q\tau_k}{m^\star}(\mathbf{v}_k\times\mathbf{B})\cdot\mathbf{G}$. Using $\mu = q\tau_k/m^\star$ and the triple product property $(\mathbf{v}_k\times\mathbf{B})\cdot\mathbf{G} = \mathbf{v}_k\cdot(\mathbf{B}\times\mathbf{G})$, the group velocity $\mathbf{v}_k$ cancels, giving $\mathbf{G} = \mathbf{F} + \mu(\mathbf{B}\times\mathbf{G})$. This equation is solved for G by taking a dot product with B on both sides to give $\mathbf{G}\cdot\mathbf{B} = \mathbf{F}\cdot\mathbf{B}$ since $\mathbf{B}\cdot(\mathbf{B}\times\mathbf{G}) = 0$, and then taking the cross product on the left with B to give $\mathbf{B}\times\mathbf{G} = \mathbf{B}\times\mathbf{F} + \mu\mathbf{B}\times(\mathbf{B}\times\mathbf{G}) = \mathbf{B}\times\mathbf{F} + \mu[(\mathbf{B}\cdot\mathbf{G})\mathbf{B} - |\mathbf{B}|^2\mathbf{G}] = \mathbf{B}\times\mathbf{F} + \mu[(\mathbf{F}\cdot\mathbf{B})\mathbf{B} - |\mathbf{B}|^2\mathbf{G}]$, where the dot product result from above is used. The equation $\mathbf{G} = \mathbf{F} + \mu(\mathbf{B}\times\mathbf{F}) + \mu^2[(\mathbf{F}\cdot\mathbf{B})\mathbf{B} - |\mathbf{B}|^2\mathbf{G}]$ thus becomes linear, and upon rearrangement immediately gives
$$\mathbf{G} = \frac{\mathbf{F}+\mu(\mathbf{B}\times\mathbf{F})+\mu^2(\mathbf{F}\cdot\mathbf{B})\mathbf{B}}{1+\mu^2|\mathbf{B}|^2}.$$
(c) Consider the changes above for the ellipsoidal conduction band of silicon, say in the (100) direction: $E_k = \frac{\hbar^2}{2}\left(\frac{k_x^2}{m_l} + \frac{k_y^2}{m_t} + \frac{k_z^2}{m_t}\right)$, which is anisotropic. Show that the inverse effective mass tensor $[M]^{-1}$ of silicon is diagonal, and its determinant is $\frac{1}{\det[M]} = \frac{1}{m_l m_t^2}$. Use this in the expressions and find the electro-magneto-thermal vector G for conduction electrons in silicon. Discuss the relative strengths of the three terms: the conductivity, the Hall, and the magnetoresistance.
(d) Show that for tensor effective masses, the generalized kinetic coefficient can be expressed as
$$K_{rs}^{ij} = \frac{1}{L^d}\cdot\left(\frac{q}{m}\right)^{r-1}\cdot\sum_k\left(-\frac{\partial f_k^0}{\partial E_k}\right)\frac{v_i v_j \tau_k^r (E_k-\mu)^s}{1+\frac{q^2\tau_k^2}{\det[M]}(\mathbf{B}\cdot[M]\mathbf{B})},$$
where the effective mass tensor matrix is [M], and det[M] is its determinant.

(21.10) No change in net energy of drifting electrons
Show that in the relaxation-time approximation, the net internal energy of electrons drifting in response to an electric field E is the same as in equilibrium, when there was no field. Explain why this is the case physically by considering electrons moving along and against the electric field, and the fact that the energy $E_k$ is even in k.

(21.11) Thermoelectric coefficients in d-dimensions
(a) Using the fundamental kinetic coefficients of Equation 21.55, show that the Seebeck coefficient for a parabolic bandstructure for transport in d-dimensions with a scattering time $\tau_k$ is
$$S = -\frac{k_b}{q}\left[-\eta + \frac{\int_0^\infty \frac{u^{\frac{d}{2}+1}\,\tau_u\, du}{\cosh^2[\frac{u-\eta}{2}]}}{\int_0^\infty \frac{u^{\frac{d}{2}}\,\tau_u\, du}{\cosh^2[\frac{u-\eta}{2}]}}\right], \quad \text{where } \eta = \frac{\mu}{k_b T} = \frac{E_F-E_c}{k_b T}.$$
Hint: Use $u = E_k/k_b T$, the fact that the DOS for d-dimensions goes as $E_k^{(d-2)/2}$, and notice that in the $(E-\mu)$ term in the numerator, the $\mu$ is a constant for which the integrals in the numerator and the denominator cancel.
(b) The Seebeck coefficient depends on how the scattering time $\tau_k$ depends on state k, or equivalently the state's energy $E_k$. In Chapters 22 and 23 it is shown that for several scattering mechanisms the scattering time can be expressed as a power law of $E_k$ in the form $\tau_k = \tau_0(\frac{E_k}{k_b T})^r = \tau_0 u^r$. Using this in part (a), show that the Seebeck coefficient simplifies to the forms given in Equation 21.73.
(c) Evaluate the Seebeck coefficient for various dimensions d and parameters r and show that they follow what is shown in Fig. 21.19. Argue why in the non-degenerate situation, $|S| \sim \frac{k_b}{q}|\eta|$ is an excellent approximation for the magnitude.

(21.12) Generalized d-dimensional Lorenz number
Show that the generalized d-dimensional Lorenz number using only the electronic part of the thermal conductivity for parabolic bands is
$$L = \left(\frac{k_b}{q}\right)^2\left[\left(\frac{d+2}{2}+r\right)\left(\frac{d+4}{2}+r\right)\frac{F_{\frac{d}{2}+r+1}(\eta)}{F_{\frac{d}{2}+r-1}(\eta)} - \left(\frac{d+2}{2}+r\right)^2\left(\frac{F_{\frac{d}{2}+r}(\eta)}{F_{\frac{d}{2}+r-1}(\eta)}\right)^2\right],$$
and that this reduces to Equation 21.24 in the non-degenerate and degenerate limits.

(21.13) Integration by parts
Show by using integration by parts that, writing $u = E_k/k_b T$ and $\eta = \mu/k_b T$ and the Fermi–Dirac distribution as $f(u) = \frac{1}{\exp[u-\eta]+1}$, the following identity holds: $\int_0^\infty\left(-\frac{\partial f(u)}{\partial u}\right)G(u)\,du = \int_0^\infty \frac{dG(u)}{du}f(u)\,du - [f(u)G(u)]_0^\infty$. If $G(0) = 0$, then the product $f(u)G(u)$ vanishes at $u = 0$ and at $u = \infty$, leaving $\int_0^\infty\left(-\frac{\partial f(u)}{\partial u}\right)G(u)\,du = \int_0^\infty f(u)\frac{dG(u)}{du}\,du$, which maps to a sum of Fermi–Dirac integrals if $\frac{dG(u)}{du}$ is a polynomial in u. This trick is very useful in the evaluation of the kinetic coefficients of Equation 21.55.

(21.14) Mott variable range hopping conductivity
Defects in heavily disordered semiconductors introduce electron energy eigenvalues deep in the gap, far from the bands. An example is shown in Fig. 21.28. In such semiconductors, electrons are captured by the deep levels and are stuck there, occasionally to be re-emitted by thermal excitation. When an electric field is applied, the bands bend, and it becomes possible for an electron to hop from one deep level site to another. Such a form of transport is called variable-range hopping, and was first explained by Mott. The electrical conductivity because of such a process comes out to be of the form $\sigma = \sigma_0 e^{-(\frac{T_0}{T})^{\frac{1}{d+1}}}$ in d spatial dimensions. In this problem, you derive this result.
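The integration-by-parts identity of Exercise 21.13 is easy to confirm numerically. This sketch (illustrative $\eta$, and $G(u) = u^3$ so that $G(0)=0$) compares the two sides of the identity:

```python
# Check the identity of Exercise 21.13: with f(u) = 1/(exp(u-eta)+1) and
# G(0) = 0, Int (-df/du) G(u) du equals Int f(u) dG/du du over [0, inf).
import math

eta = 1.5   # reduced chemical potential (illustrative)

def f(u):
    return 1.0 / (math.exp(u - eta) + 1.0)

def mdf(u):                      # -df/du = f*(1-f) for the Fermi-Dirac form
    return f(u) * (1.0 - f(u))

G  = lambda u: u**3              # any G with G(0) = 0
dG = lambda u: 3 * u**2          # dG/du

umax, n = 60.0, 120_000          # midpoint-rule integration grid
du = umax / n
lhs = sum(mdf((i + 0.5) * du) * G((i + 0.5) * du) for i in range(n)) * du
rhs = sum(f((i + 0.5) * du) * dG((i + 0.5) * du) for i in range(n)) * du
print(lhs, rhs)   # equal up to discretization error
```

With $dG/du$ a polynomial, the right-hand side is exactly the sum of Fermi–Dirac integrals that the kinetic coefficients of Equation 21.55 reduce to.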

Fig. 21.28 Electron transport by hopping. The concept of variable range hopping was introduced by Mott.

(a) The current density $J = \sigma F$, where F is the electric field, and $\sigma$ is the conductivity. The conductivity is conventionally thought of in the drift-picture as $\sigma = qn\mu$, where $\mu = q\tau/m^\star$ is the electron mobility. Instead of the band picture of the transport of electrons of effective mass $m^\star$ with a mean scattering time of $\tau$, describe why the situation in the case of Fig. 21.28 is different. The electrons are localized and may not access the band states: they may tunnel directly from one localized state to another. The localized wavefunctions vary as $\psi \sim e^{-r/\lambda}$, where $\lambda$ is the spatial spread of the localized wavefunction, and their populations for barrier height W go as the Boltzmann weight $e^{-W/k_b T}$. The hopping conductivity should be proportional to the product of these exponentials.
(b) Argue why the transition probability from site $i \to j$ is
$$S_{i\to j} = \frac{D_c^2 A_0^2 W}{\pi\rho_0 v_s^5 \hbar^4}\exp\left[-\frac{2R}{\lambda} - \frac{W}{k_b T}\right].$$
Here $D_c$ is the deformation potential (see Chapters 22 and 23), $A_0$ is an exchange integral, $W \approx k_b T_D = \hbar v_s q_D$ is the barrier height surmountable by phonon-assisted tunneling, where $T_D$ is the Debye temperature and $q_D$ the Debye wavevector, R is the distance between the localized sites, $\rho_0$ is the mass density of the semiconductor, $v_s$ is the sound velocity, and T is the temperature.
(c) For d = 3, show that the extremum of W occurs at a radial distance $\bar{R} = \left(\frac{9\lambda}{8\pi k_b T g}\right)^{\frac{1}{4}}$ and at the energy $\bar{W} = \frac{3}{4\pi \bar{R}^3 g}$, leading to the variable range conductivity $\sigma = \sigma_0 e^{-(\frac{T_0}{T})^{\frac{1}{4}}}$, where $T_0 = \frac{9}{\pi k_b g \lambda^3}$. Here $g = g(E_F)$ is the DOS at the Fermi level.
(d) Consider the distance between localized states to be R, the attempt frequency for tunneling to be $\nu_{ph}$ (of order a phonon frequency), and the tunneling decay length to be $\lambda$. Write the expression for the current density in terms of these parameters.
Solution: The current density is written in analogy to the formulation $J = qnv$. Identifying the terms, we get
$$J \approx q\, g(E) k_b T \cdot R\nu_{ph}\, e^{-\frac{2R}{\lambda}} \cdot \left[ e^{-\frac{\Delta E - qFR}{k_b T}} - e^{-\frac{\Delta E + qFR}{k_b T}} \right]$$
$$= 2q\, g(E) k_b T \cdot R\nu_{ph}\, e^{-\frac{2R}{\lambda} - \frac{\Delta E}{k_b T}} \cdot \sinh\left[\frac{qFR}{k_b T}\right]$$
$$\approx 2q^2\, g(E)\, R^2 \nu_{ph}\, F\, e^{-\frac{2R}{\lambda} - \frac{\Delta E}{k_b T}} \quad (\text{linearizing the sinh for } qFR \ll k_b T)$$
$$\implies \sigma = J/F \approx 2q^2 g(E) \nu_{ph} R^2\, e^{-\frac{2R}{\lambda} - \frac{\Delta E}{k_b T}}. \qquad (21.85)$$
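The d = 3 extremum quoted in part (c) can be checked by brute force. The sketch below (with illustrative values for g, $\lambda$, and T, not from the book) minimizes the hopping exponent $2R/\lambda + \Delta E/k_bT$ with $\Delta E = 3/(4\pi R^3 g)$ and compares the optimum to the analytic $\bar{R}$:

```python
# Numeric check of the variable-range-hopping optimum distance:
# minimizing 2R/lam + 3/(4*pi*R**3*g*kb*T) should give R**4 = 9*lam/(8*pi*g*kb*T).
import math

kb  = 1.380649e-23        # J/K
g   = 1e45                # DOS at E_F, states/(J*m^3)  (illustrative)
lam = 1e-9                # localization length, m      (illustrative)
T   = 100.0               # K

def exponent(R):
    return 2 * R / lam + 3.0 / (4 * math.pi * R**3 * g * kb * T)

R_analytic = (9 * lam / (8 * math.pi * g * kb * T))**0.25

# brute-force scan around the analytic optimum
Rs = [R_analytic * (0.5 + 0.001 * i) for i in range(1001)]
R_numeric = min(Rs, key=exponent)
print(R_analytic, R_numeric)   # the two agree closely
```

For these parameters the optimal hop is a few nanometres, i.e. several localization lengths: the electron indeed hops "further in space to states closer in energy", which is the essence of Mott's argument in part (e).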

(e) Mott (Fig. 14.3) argued that instead of an electron tunneling to a nearest neighbor energy state that would be high in energy, the electron could tunnel further in space to states closer to it in energy. The number of such states available per unit energy interval is $\frac{4}{3}\pi R^3 \cdot g(E_F)$, which implies that the energy separation is $\Delta E = \frac{3}{4\pi R^3 g(E_F)}$. Show that this leads directly to the Mott variable range formula for hopping conductivity.
Solution: The maximum is reached for R such that
$$\frac{d}{dR}\left[-\frac{2R}{\lambda} - \frac{3}{4\pi R^3 g(E_F) k_b T}\right] = 0 \implies R_0^4 = \frac{9\lambda}{8\pi g(E_F) k_b T}, \quad T_0 = \frac{9}{\pi k_b g(E_F) \lambda^3},$$
$$\sigma = 3\nu_{ph} q^2 \cdot \sqrt{\frac{\lambda g(E_F)}{2\pi k_b T}} \cdot e^{-(T_0/T)^{\frac{1}{4}}}. \qquad (21.86)$$
(f) Using similar arguments as above, now show that in d-dimensions, Mott's variable range hopping conductivity is given by the relation $\sigma = \sigma_0 e^{-(\frac{T_0}{T})^{\frac{1}{d+1}}}$, with $\sigma_0$ and $T_0$ specific to each dimension.

(21.15) Obtaining the dynamical form of the BTE
Provide the steps to arrive at the useful dynamical form of the BTE in Equation 21.83 from Equation 21.82. The two terms on the RHS of Equation 21.82 are $\frac{q}{\hbar\Omega}\sum_k \mathbf{v}_k(\mathbf{F}\cdot\nabla_k f_k) + \frac{q}{\Omega}\sum_k \mathbf{v}_k(\mathbf{v}_k\cdot\nabla_r f_k)$, where $\mathbf{F} = -q(\mathbf{E} + \mathbf{v}_k\times\mathbf{B})$ is the Lorentz force on the electron. Show that $\frac{q}{\hbar\Omega}\sum_k \mathbf{v}_k(\mathbf{F}\cdot\nabla_k f_k) = \frac{nq^2}{m^\star}\mathbf{E} - \frac{q}{m^\star}(\mathbf{J}\times\mathbf{B})$ and $\frac{q}{\Omega}\sum_k \mathbf{v}_k(\mathbf{v}_k\cdot\nabla_r f_k) = +\frac{qD}{\tau_0}\nabla_r n$, which results in the form given in Equation 21.83.

(21.16) Monte Carlo and path integrals
The Monte Carlo technique is used for solving the BTE, especially for understanding transport at high electric fields. Write down a flowchart that explains how this stochastic method is applied to evaluate the response of an ensemble of electrons to external fields. Also investigate the interesting origin of the Monte Carlo algorithm: a story which features mathematicians Stanislaw Ulam and John von Neumann, gambling, and the building of the atomic bomb.

(21.17) Boltzmann transport: scattering and mobility
This problem provides a preview of transport properties that will be the subject of Chapter 23.
(a) We derived the solution to the Boltzmann transport equation in the relaxation-time approximation for elastic scattering events to be $f(k) \approx f_0(k) + \tau(k)\left(-\frac{\partial f_0(k)}{\partial E(k)}\right)\mathbf{v}_k\cdot\mathbf{F}$, where all symbols have their usual meanings. Use this to show that for transport in d dimensions in response to a constant electric field E, in a semiconductor with an isotropic effective mass $m^\star$, the current density is $\mathbf{J} = \frac{nq^2\langle\tau\rangle}{m^\star}\mathbf{E}$, where
$$\langle\tau\rangle = \frac{2}{d}\cdot\frac{\int dE\cdot\tau(E)\,E^{\frac{d}{2}}\left(-\frac{\partial f_0(E)}{\partial E}\right)}{\int dE\cdot E^{\frac{d}{2}-1} f_0(E)},$$
where the integration variable $E = E(k)$ is the kinetic energy of carriers. You have now at your disposal the most general form of conductivity and mobility from the Boltzmann equation for semiconductors that have a parabolic bandstructure! Hint: You may need the result that the volume of a d-dimensional sphere in k-space is $V_d = \frac{\pi^{\frac{d}{2}} k^d}{\Gamma(\frac{d}{2}+1)}$, and some more dimensional and $\Gamma$-function information from Chapter 4.
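A useful sanity check of the $\langle\tau\rangle$ expression in part (a) (a sketch, in reduced units $u = E/k_bT$ with a Fermi–Dirac $f_0$): for a constant scattering time $\tau(E) = \tau_0$, the integration-by-parts trick of Exercise 21.13 forces $\langle\tau\rangle = \tau_0$ exactly, for any $\eta$ and any dimension d, and the numerics reproduce this:

```python
# Numeric check that <tau> = (2/d) * Int tau(E) E^{d/2} (-df0/dE) dE
#                            / Int E^{d/2-1} f0(E) dE
# reduces to tau0 for a constant scattering time tau(E) = tau0.
import math

def avg_tau(d, tau0, eta=-5.0, umax=40.0, n=200_000):
    """Ensemble average in reduced units u = E/(kb*T), Fermi-Dirac f0 at
    reduced chemical potential eta; constant tau(E) = tau0 assumed."""
    du = umax / n
    num = den = 0.0
    for i in range(1, n):
        u = i * du
        f0 = 1.0 / (math.exp(u - eta) + 1.0)
        mdf = f0 * (1.0 - f0)            # -df0/du for the Fermi-Dirac form
        num += tau0 * u**(d / 2) * mdf * du
        den += u**(d / 2 - 1) * f0 * du
    return (2.0 / d) * num / den

print(avg_tau(3, 2.0))   # ≈ 2.0
```

Integrating the numerator by parts gives $\int u^{d/2}(-\partial f_0/\partial u)\,du = \frac{d}{2}\int u^{d/2-1}f_0\,du$, so the $(2/d)$ prefactor cancels it exactly; energy-dependent $\tau(E)$ instead weights the average toward the carriers that dominate transport.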

Fig. 21.29 Electron mobility in a doped semiconductor at high temperatures is limited by phonon scattering, and by impurity and defect scattering at low temperatures. In this problem, you show that the ionized-impurity scattering limited mobility goes as $T^{3/2}/N_D$.

(b) Scattering from uncorrelated events: Show using Fermi's golden rule that if the scattering rate of electrons in a band of a semiconductor due to the presence of one scatterer of potential W(r) centered at the origin is $S(k\to k') = \frac{2\pi}{\hbar}|\langle k'|W(r)|k\rangle|^2\delta(E_k - E_{k'})$, then the scattering rate due to $N_s$ scatterers distributed randomly and uncorrelated in 3D space is $N_s\cdot S(k\to k')$. In other words, the scattering rate increases linearly with the number of uncorrelated scatterers, which implies that the mobility limited by such scattering will decrease as $1/N_s$. This argument is subtle, and effects of electron wave interference should enter your analysis. Hint: Add the potentials of each randomly distributed impurity for the total potential $W_{tot}(\mathbf{r}) = \sum_i W(\mathbf{r} - \mathbf{R}_i)$. Use the effective mass equation for the electron states to show that the matrix element is a Fourier transform. Then invoke the shifting property of Fourier transforms.
(c) Impurity scattering: Using Fermi's golden rule, calculate the scattering rate for electrons due to a screened Coulombic charged impurity potential $V(r) = -\frac{Ze^2}{4\pi\epsilon_s r}e^{-\frac{r}{L_D}}$, where Ze is the charge of the impurity, $\epsilon_s$ is the dielectric constant of the semiconductor, and $L_D = \sqrt{\frac{\epsilon_s k_b T}{ne^2}}$ is the Debye screening length, with n the free carrier density. This is the scattering rate for just one impurity. Show using the result in parts (a) and (b), with a $1 - \cos\theta$ angular factor for mobility, that if the charged-impurity density is $N_D$, the mobility for 3D carriers is
$$\mu_I = \frac{2^{\frac{7}{2}}(4\pi\epsilon_s)^2 (k_b T)^{\frac{3}{2}}}{\pi^{\frac{3}{2}} Z^2 e^3 \sqrt{m^\star}\, N_D\, F(\beta)} \sim \frac{T^{\frac{3}{2}}}{N_D}.$$
Here $\beta = 2L_D\frac{\sqrt{2m^\star(3k_b T)}}{\hbar}$ is a dimensionless parameter, and $F(\beta) = \ln[1+\beta^2] - \frac{\beta^2}{1+\beta^2}$ is a weakly varying function. This famous result is named after Brooks and Herring, who derived it first. Estimate the ionized impurity scattering limited mobility at T = 300 K for $m^\star = 0.2m_0$, $\epsilon_s = 10\epsilon_0$, Z = 1, and $N_D \sim 10^{17}, 10^{18}, 10^{19}$ /cm³. Are your values close to what is experimentally observed for these conditions as shown in Fig. 21.29?

(21.18) We now explain the complete temperature dependence of the electron mobility in some (not all!) doped 3D semiconductors. Fig. 21.29 shows the experimental result: at low temperatures, the electron mobility increases with temperature as $\mu(T) \sim T^{3/2}/N_D$, and at high temperature it decreases with temperature as $\mu(T) \sim 1/T^{3/2}$. We first connect the mobility to the scattering times via the Drude-like result $\mu = \frac{q\langle\tau\rangle}{m_c^\star}$, where you found how to calculate the ensemble averaged scattering time $\langle\tau\rangle$ in part (a).
(d) Phonon scattering: The scattering rate of electrons due to acoustic phonons in semiconductors is given by Fermi's golden rule result for time-dependent oscillating perturbations $\frac{1}{\tau(k\to k')} = \frac{2\pi}{\hbar}|\langle k'|W(\mathbf{r})|k\rangle|^2\delta(E_k - E_{k'} \pm \hbar\omega_q)$, where the acoustic phonon dispersion for low energy (or long wavelength) is $\omega_q \sim v_s q$ with $v_s$ the sound velocity, and the scattering potential is $W(\mathbf{r}) = D_c\nabla_r\cdot\mathbf{u}(\mathbf{r})$. Here $D_c$ is the deformation potential (units: eV), and $\mathbf{u}(\mathbf{r}) = \hat{n}u_0 e^{i\mathbf{q}\cdot\mathbf{r}}$ is the spatial part of the phonon displacement wave, $\hat{n}$ is the unit vector in the direction of atomic vibration, and the phonon wavevector q points in the direction of the phonon wave propagation. We also justified why the amplitude of vibration $u_0$ may be found from $2M\omega_q^2 u_0^2 \approx N_{ph}\times\hbar\omega_q$, where $N_{ph} = 1/[e^{\frac{\hbar\omega_q}{k_b T}} - 1]$ is the Bose-number of phonons, and the mass of a unit cell of volume $\Omega$ is $M = \rho\Omega$, where $\rho$ is the mass density (units: kg·m⁻³). Show that a transverse acoustic (TA) phonon does not scatter electrons, but longitudinal acoustic (LA) phonons do.
(e) By evaluating the scattering rate using Fermi's golden rule, and using the ensemble averaging of part (a), show that the electron mobility in three dimensions due to LA phonon scattering is
$$\mu_{LA} = \frac{2\sqrt{2\pi}}{3}\frac{q\hbar^4 \rho v_s^2}{D_c^2 (k_b T)^{\frac{3}{2}} (m_c^\star)^{\frac{5}{2}}} \sim T^{-\frac{3}{2}}.$$
This is a very useful result.
(f) Now combine your work from all parts of this problem to explain the experimental dependence of mobility vs. temperature and as a function of impurity density as seen in Fig. 21.29.

The density-matrix formalism for transport
The density matrix formalism of transport is based on the partition function, which for a quantum system takes the form of the density matrix $\hat{\rho} = \frac{1}{Z}\exp[-\beta\hat{H}]$, whose time evolution is given by $i\hbar\frac{\partial\hat{\rho}}{\partial t} = [\hat{H},\hat{\rho}]$, where $[A, B] = AB - BA$ is the commutator. The Kubo formalism of transport is based on this fully quantum mechanical approach. The system Hamiltonian is split between the unperturbed equilibrium state Hamiltonian $\hat{H}_0$ and a perturbation: $\hat{H} = \hat{H}_0 + \hat{W}$. Study this formalism and write a short essay on how this method can be reduced to several approximations, one of which is the Boltzmann approximation. A distinction can also be made between open and closed quantum systems: for an open system in which particle number or energy can vary, the time evolution includes generation/recombination terms of the kind $i\hbar\frac{\partial\hat{\rho}}{\partial t} = [\hat{H},\hat{\rho}] \pm \Gamma\hat{\rho}$, where $\Gamma$ has units of energy.
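The estimate requested in Exercise 21.17(c) can be scripted directly from the Brooks–Herring formula. This sketch assumes full ionization, i.e. $n = N_D$ for the screening density, and SI constants:

```python
# Rough numeric evaluation of the Brooks-Herring ionized-impurity mobility
# (assumes the free carrier density n equals N_D for the Debye screening).
import math

e    = 1.602e-19          # C
e0   = 8.854e-12          # F/m
kb   = 1.380649e-23       # J/K
hbar = 1.0546e-34         # J*s
m0   = 9.109e-31          # kg

def mu_brooks_herring(T, mstar, es_rel, Z, ND_per_cm3):
    """Ionized-impurity-limited mobility in cm^2/(V*s)."""
    ND = ND_per_cm3 * 1e6                      # impurity density, m^-3
    es = es_rel * e0
    kT = kb * T
    LD = math.sqrt(es * kT / (ND * e**2))      # Debye length (n ~ ND assumed)
    beta = 2 * LD * math.sqrt(2 * mstar * 3 * kT) / hbar
    F = math.log(1 + beta**2) - beta**2 / (1 + beta**2)
    num = 2**3.5 * (4 * math.pi * es)**2 * kT**1.5
    den = math.pi**1.5 * Z**2 * e**3 * math.sqrt(mstar) * ND * F
    return num / den * 1e4                     # m^2/(V*s) -> cm^2/(V*s)

for ND in (1e17, 1e18, 1e19):
    print(ND, mu_brooks_herring(300.0, 0.2 * m0, 10.0, 1, ND))
# mobility falls roughly as 1/ND, softened by the logarithmic F(beta)
```

For $m^\star = 0.2m_0$ and $\epsilon_s = 10\epsilon_0$ at 300 K this gives mobilities of order $10^3$ to $10^4$ cm²/(V·s) that decrease roughly as $1/N_D$, the trend sketched in Fig. 21.29; Brooks–Herring is known to overestimate somewhat at the highest doping densities.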

Taking the Heat: Phonons and Electron-Phonon Interactions

The periodic arrangement of atoms in a semiconductor crystal and the chemical bonds connecting them is the atomic-scale realization of a classical mass-spring network (or an electrical LC network in the form of a transmission line). Collisions of an external mass with a mass-spring network excite vibrational waves by exchanging momentum and energy. In much the same way, sound waves, electrons, or photons moving inside a crystal can exchange momentum and energy strongly with the crystal by either losing energy to, or absorbing energy from, the lattice vibrations. The vibrations are quantized: a quantum of vibrational energy is called a phonon (from "phono", or sound). Because phonons and their interactions with electrons and photons are integral to the physical properties of semiconductor materials and devices, we devote this chapter to learn:

22.1 Phonon effects: A résumé
22.2 Phonon dispersions and DOS
22.3 Optical conductivity
22.4 Lyddane–Sachs–Teller
22.5 Acoustic wave devices
22.6 Thermal conductivity
22.7 Phonon number quantization
22.8 Electron-phonon interaction
22.9 Chapter summary section
Further reading
• What are the effects of phonons on the properties of solids in general, and semiconductors in particular? • What are the allowed modes, energy dispersions, density of states, and quantum statistics of phonons in semiconductors? • How do phonons influence the optical, electronic, and thermal conductivity of semiconductors? • Which device applications of semiconductors are rooted in the properties of phonons?

22.1 Phonon effects: A résumé

In the late 1890s a curious property of crystalline insulators called the Reststrahlen band was observed. When broadband infrared light was reflected off the surface of the crystal, a narrow band of wavelengths was reflected with a near-unity coefficient, as if there were a "silver coating" only for this band. The specific wavelength range was characteristic of the crystal. Multiple reflections purified the infrared spectra further; the technique of using the Reststrahlen ("residual rays") was instrumental in the development of quantum mechanics itself (Fig. 22.1). As

Fig. 22.1 Around 1900, Heinrich Rubens (above) discovered the Reststrahlen effect in the optical reflectivity of crystals, working with Ernest Nichols. Using this effect, Rubens constructed the Reststrahlen infrared spectrometer, with which he accurately measured the blackbody radiation spectrum. He shared his data with Max Planck, and the next day Planck had obtained his famous radiation formula, which launched quantum mechanics.


1 Experimentally it was known well before the advent of quantum mechanics that insulating solids with light atoms and strong chemical bonds (e.g. diamond, SiC) are excellent conductors of sound, heat, and light (e.g. optically transparent), and poor conductors of electricity. They of course violated the classical Wiedemann–Franz law because of their low electrical conductivity. We have seen that the reason is not because they have fewer electrons per se, but because of energy bandgaps and fewer free electrons near the Fermi level.

Fig. 22.2 Peter Debye made important contributions to several areas of physics such as X-ray scattering and the heat capacity of solids. The Einstein and the Debye models of heat capacity are the earliest quantum theories of solids. Debye was awarded the 1936 Nobel Prize in Chemistry.

2 Theoretical models today can accurately calculate the experimentally measured phonon dispersions of bulk solids, and are developed enough to guide the design of phonon spectra in multilayer structures. Models for the calculation of the phonon dispersions of solids range from semi-classical mass-spring kinds (Born) through those based on stretching and bending of actual chemical bonds (Keating), to the modern first principles DFT methods, with increasing computational cost and accuracy.

we will see in this and later chapters, this narrow band of frequencies lies between the transverse optical (TO) and longitudinal optical (LO) branches of the phonon spectrum of a crystal¹. An unresolved mystery in the early 1900s was the temperature dependence of the heat capacity of solids, as we discussed in Chapter 4, specifically in Exercises 4.7, 4.8, and 4.9. By direct application of the classical laws of thermodynamics with the assumption of the presence of atoms, the Dulong–Petit law explained why the heat capacity per unit volume of a 3D solid is $c_v = 3nk_B$, where n is the atomic density and $k_B$ the Boltzmann constant. Experimentally this T-independence is violated dramatically at low temperatures, since $c_v \to 0$ as $T \to 0$. The first explanation for this behavior was offered by Einstein in 1907. Drawing on the analogy with the quantization of photons, he hypothesized that lattice vibrations are quantized. His bold claim was that the same rule of quantization of energy that applied to oscillating electromagnetic waves in blackbody radiation also applied to matter waves in solids. His simplified model assumed that all phonons have the same energy $\hbar\omega_0$, and that phonons follow the Bose–Einstein occupation function in energy. We now know that this is the correct model only for optical phonon modes, whose non-zero energy is indeed nearly constant, independent of wavelength. Though Einstein's model of the heat capacity of solids (the first application of quantum theory to solids) explained the vanishing of $c_v$ as $T \to 0$ K, it was inconsistent with the quantitative temperature dependence, which went as $c_v \sim T^3$. This was fixed by Debye (Fig. 22.2) in 1912, who hypothesized the existence of a range of phonon frequencies $\omega_\beta$ starting from zero and increasing with the phonon wavevector $\beta\,(= 2\pi/\lambda)$ as $\omega_\beta = v_s\beta$, where $v_s$ is the velocity of sound in the solid.
Debye thus first hypothesized the existence of acoustic phonon modes in a solid: his model remains highly successful in explaining the thermal properties of solids because the low energy acoustic phonon modes are occupied much more than the higher energy optical phonon modes. The phonons therefore have a "bandstructure" similar to that of electrons. They are measured experimentally by several complementary techniques. For very long wavelengths in the q → 0 limit, Raman spectroscopy directly measures the zone-center optical phonon energies, and Brillouin spectroscopy measures the acoustic modes. Neutron diffraction and inelastic X-ray scattering techniques are used to measure the phonon dispersion across the entire Brillouin zone². In semiconductors, under ordinary conditions, phonons influence carrier transport and optical phenomena to perturbative order. This means phonons do not change the electronic bandstructure, and only influence the transition rates between states via scattering, treated via Fermi's golden rule. But they dominate the propagation of heat (thermal conductivity) and sound. Sound transport in semiconductors forms the basis of devices such as surface acoustic wave (SAW) and bulk acoustic wave (BAW) filters and resonators. Phonons strongly coupled to electrons form quasiparticles called polarons, and phonons



Fig. 22.3 A comparison of phonon dispersion, density of states, and occupation functions with the corresponding quantities for electrons in semiconductor crystals. Phonon energies are limited to ∼ 10s of meV, whereas electronic band energies range over ∼ 10 eV: note the large difference in scale! As opposed to electron energies, phonon energies are bounded from zero to a maximum optical phonon energy. Because of the linear dispersion at low energies, the acoustic phonon DOS increases as the square of the energy (see Exercise 5.3). The Reststrahlen band lies between the LO and TO modes. As opposed to the Fermi–Dirac occupation function of electrons, which are fermions, phonon occupation follows the Bose–Einstein distribution because phonons are bosons. Whereas the electron occupation function is limited to a maximum of 1 due to the Pauli exclusion principle, no such limitation exists for phonons.

strongly coupled to photons form polaritons. A dramatic effect of strong electron-phonon coupling is the formation of Cooper pairs and superconductivity, which drastically changes the electronic properties into a new phase. Just as we developed the electronic energy bandstructures of semiconductors to understand their electronic and photonic properties, we begin our exploration of the physics of phonons in semiconductors by first investigating their energy bandstructure.
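The contrast between the Einstein and Debye heat-capacity models discussed above can be made quantitative with a short numeric sketch (the temperatures used are illustrative); here c is expressed in units of the Dulong–Petit value $3nk_B$:

```python
# Einstein vs Debye heat capacity, in units of the Dulong-Petit value 3*n*kb.
# Einstein: c = x^2 e^x/(e^x-1)^2 with x = theta_E/T (single phonon energy);
# Debye:    c = 3 (T/theta_D)^3 Int_0^{theta_D/T} x^4 e^x/(e^x-1)^2 dx.
import math

def c_einstein(T, theta_E):
    x = theta_E / T
    return x**2 * math.exp(x) / (math.exp(x) - 1.0)**2

def c_debye(T, theta_D, n=20_000):
    xm = theta_D / T
    dx = xm / n
    s = 0.0
    for i in range(1, n + 1):
        x = (i - 0.5) * dx                 # midpoint rule
        s += x**4 * math.exp(x) / (math.exp(x) - 1.0)**2 * dx
    return 3.0 * (T / theta_D)**3 * s

# Both models recover Dulong-Petit (c -> 1) at high T:
print(c_einstein(3000.0, 300.0), c_debye(3000.0, 300.0))   # both ≈ 1
# At low T the Debye c falls as T^3, while the Einstein c vanishes
# exponentially faster (its phonons all cost the full quantum hbar*omega_0):
print(c_einstein(15.0, 300.0), c_debye(15.0, 300.0))
```

This is exactly the distinction in the text: only the Debye model's low-energy acoustic modes remain appreciably occupied as $T \to 0$, producing the observed $T^3$ law.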

22.2 Phonon dispersions and DOS

Before describing the specific phonon dispersions of semiconductors and how they are measured or calculated, a direct comparison of their properties with those of electrons is both highly illustrative and places the discussion in context. Fig. 22.3 shows them side by side. The properties of electrons on the left are discussed extensively earlier in this book. The corresponding properties of phonons are on the right. The figure label contrasts the properties, emphasizing the fact that phonons are bosons whose occupation function follows Bose–Einstein statistics, and that their energies are in the meV range for acoustic modes and in the 10s of meV range for optical modes. We now discuss how the phonon dispersion (or bandstructure) of a crystal is obtained, starting from the vibrational properties of molecules. Fig. 22.4 shows the energy of interaction between two atoms when they are a distance r apart. For large r there is a weak van der Waals attraction, whereas for very small r there is strong repulsion due to Coulomb interaction and Pauli exclusion. The distance of lowest energy r = a is the equilibrium bond length. If the molecule is stretched from the equilibrium position r = a, the energy near the minimum

Fig. 22.4 Harmonic approximation. The interaction energy between atoms has a minimum, and near this minimum the energy is roughly quadratic in r. Small vibrations around the equilibrium position are modeled by a mass-spring system. The natural vibration frequency is ω = √(K/Mr), where 1/Mr = 1/M1 + 1/M2 is the reduced mass of the two masses connected by a spring of spring constant K. The lowering of energy drives the formation of H2 or N2 molecules.

534 Taking the Heat: Phonons and Electron-Phonon Interactions

Fig. 22.5 Longitudinal acoustic (LA) phonons in a linear chain of atoms modeled as a mass-spring system. Note the adjacent atoms move in phase (i.e. in the same direction), a characteristic of acoustic phonons. Since the vibrations are along the x-axis which is the direction of the wave propagation, it is a longitudinal acoustic (LA) wave.

³ Note that IR spectroscopy also forms the basis for gas detection and pollution control because the vibrational frequencies are like ”fingerprints” of the specific molecular species. Semiconductor photonics is used for both the sources and detectors of IR photons for such studies, as described in Chapters 28 and 29.

⁴ Note that though x is a continuous spatial variable, because u_s(x) is the displacement of the sth atom, it is defined only at the values x_s = sa0 where s = ..., −1, 0, 1, ..., and not for any other x. Going from the first to the second line of Equation 22.1, u0 e^{i(βsa0 − ωt)} cancels from both sides.

varies nearly quadratically with r − a, consistent with the harmonic oscillator approximation indicated in Fig. 22.4. Because the mass M of the nuclei is ∼ 10³ × heavier than the electron mass m_e, the vibrational energies are ∼ m_e/M times smaller for the same momentum. Because the electronic energy levels are in the 10s of eV, this pegs the energy scale of vibrations at 10s of meV. These vibrational energies fall in the infrared (IR) regime of the spectrum, and are directly measured by IR spectroscopy³. Because of the approximately parabolic potential, the energy levels are those of a quantum mechanical harmonic oscillator of energies E_n = (n + 1/2)ℏω, where ω = √(2K/M) is the oscillation frequency, with K the spring constant and M the atomic mass. Lighter atoms and stronger chemical bonds lead to higher ω, but for a particular molecule, the frequency ω is fixed within the parabolic approximation. A higher quantized energy level n therefore implies a larger amplitude of oscillation at the same frequency. Instead of two atoms, Fig. 22.5 shows a 1D chain of atoms with equilibrium lattice constant a0. Imagine moving one atom away from its equilibrium position along x and letting it go. Since the atom is coupled to all other masses through the spring-mass network, this sets off a wave of vibration of all masses traveling through the chain in the x-direction. The energy imparted is stored in the system in the potential energies of the stretched or squeezed springs, and the kinetic energy of the moving masses. This is an example of a collective excitation, in which energy is not stored in any particular atom, but collectively by all. The harmonic approximation results in a Hooke's law force F = −K∆x on the atoms.
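The vibrational energy scale of a diatomic molecule can be made concrete with a short numerical sketch. The spring constant below is an illustrative value, not fitted to any real molecule; only the formulas ω = √(K/Mr) and E_n = (n + 1/2)ℏω are taken from the text:

```python
import math

def omega_diatomic(K, M1, M2):
    """Vibration frequency w = sqrt(K/Mr) of two masses joined by a spring,
    with reduced mass 1/Mr = 1/M1 + 1/M2 (Fig. 22.4)."""
    Mr = M1 * M2 / (M1 + M2)
    return math.sqrt(K / Mr)

amu = 1.66054e-27   # kg
K = 500.0           # N/m, illustrative bond stiffness

w_light = omega_diatomic(K, 1.0 * amu, 1.0 * amu)    # H2-like masses
w_heavy = omega_diatomic(K, 14.0 * amu, 14.0 * amu)  # N2-like masses

# Lighter atoms vibrate faster for the same spring constant
assert w_light > w_heavy

# Equal masses M give Mr = M/2, i.e. w = sqrt(2K/M) as in the text
assert abs(w_light - math.sqrt(2 * K / (1.0 * amu))) < 1e-6 * w_light

# The energy quantum hbar*w lands in the 10s-100s of meV: the IR range
hbar = 1.054571817e-34
E_meV = hbar * w_heavy / 1.602176634e-19 * 1e3
assert 10.0 < E_meV < 1000.0
```

The last assertion confirms the order-of-magnitude argument above: nuclear masses ∼10³–10⁵ times the electron mass pull the vibrational quanta down from the eV scale of electronic levels into the meV (infrared) range.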
If the displacement of the sth atom from equilibrium is written as a propagating wave u_s = u0 e^{i(βx_s − ωt)}, nearest-neighbor interactions on the sth atom at x_s = sa0 result in a restoring force⁴

$$ M\frac{d^2 u_s}{dt^2} = K(u_{s+1} - u_s) + K(u_{s-1} - u_s), \quad \text{with } u_s(x) = u_0 e^{i(\beta s a_0 - \omega t)} $$
$$ \implies -M\omega^2 = K(e^{i\beta a_0} - 1) + K(e^{-i\beta a_0} - 1) $$
$$ \implies \omega_\beta = \sqrt{\frac{2K(1-\cos\beta a_0)}{M}} = \sqrt{\frac{4K}{M}}\,\Big|\sin\frac{\beta a_0}{2}\Big| \approx \underbrace{\Big[a_0\sqrt{\frac{K}{M}}\Big]}_{v_s}\,\beta = v_s\beta, \quad (22.1) $$






Fig. 22.6 Longitudinal acoustic (LA) and transverse acoustic (TA) phonons. The atoms vibrate parallel to the wave propagation direction (u_s ∥ β) in the longitudinal modes, and perpendicular to it (u_s ⊥ β) in the transverse modes. Note that the adjacent atoms may be different.

which yields an explicit dispersion relation of the phonon modes. For this phonon dispersion, the long-wavelength limit λ → ∞ and β = 2π/λ → 0 give v_s = a0√(K/M), with the dimensions of a velocity. This is the speed of sound propagation through the mass-spring system, and the modes are called ”acoustic” phonons. For periodic boundary conditions imposed on the chain (imagining it as part of a large circular ring) of length L, the allowed wavevectors are β_n = n · 2π/L where n is an integer, and there are N = L/a0 phonon modes that define a Brillouin zone −π/a0 ≤ β ≤ +π/a0. The resulting

acoustic mode dispersion ℏω_β = ℏ√(4K/M)|sin(βa0/2)| inside the BZ, and the resulting acoustic phonon density of states g_ph(E), are shown in Fig. 22.3, with an analogy to the electron bandstructure and energy dispersion. For a typical sound velocity v_s ∼ 10⁵ cm/s in a semiconductor, a phonon mode of wavelength λ ∼ 1 µm has energy ℏv_sβ ∼ 20 µeV. This energy is so small that even at room temperature of T = 300 K, ℏv_sβ ≪ k_bT. For crystals with N > 1 atoms in the unit cell, additional optical modes are formed. These vibrational modes are shown in Fig. 22.8. The major difference from the acoustic modes is that the vibrations of nearest-neighbor atoms are out of phase, in opposite directions. They may be visualized as an acoustic-like vibration of one atom of the unit cell which is 180° out of phase with that of the other atom of the unit cell. Writing the displacements of the two atoms as x1 and x2 and repeating the same procedure as Equation 22.1 for the 2-atom basis, the reader is encouraged to show that the new phonon energy dispersion for crystals with a multi-atom basis is

$$ \omega_\pm^2(\beta) = \frac{K}{M_r}\left[1 \pm \sqrt{1 - \frac{2M_r}{M_1+M_2}\,(1-\cos\beta a_0)}\right], \quad (22.2) $$
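Equation 22.2 can be verified numerically; a minimal sketch with illustrative spring constant and masses (not those of any particular crystal) confirms the branch behavior discussed below:

```python
import math

def omega_pm(beta, K, M1, M2, a0, sign):
    """Equation 22.2: w_pm^2 = (K/Mr)[1 +/- sqrt(1 - (2Mr/(M1+M2))(1 - cos(beta*a0)))]."""
    Mr = M1 * M2 / (M1 + M2)
    root = math.sqrt(1.0 - (2.0 * Mr / (M1 + M2)) * (1.0 - math.cos(beta * a0)))
    return math.sqrt((K / Mr) * (1.0 + sign * root))

K, a0 = 10.0, 1.0        # illustrative spring constant and lattice constant
edge = math.pi / a0      # Brillouin zone edge

# Equal masses (Si-, Ge-, diamond-like basis): the branches touch at the zone edge
assert abs(omega_pm(edge, K, 1.0, 1.0, a0, +1)
           - omega_pm(edge, K, 1.0, 1.0, a0, -1)) < 1e-9

# Unequal masses (GaAs-like basis): a gap opens; the edge optical energy is set by
# the lighter atom, sqrt(2K/M1), and the edge acoustic energy by the heavier, sqrt(2K/M2)
w_lo = omega_pm(edge, K, 1.0, 3.0, a0, +1)
w_la = omega_pm(edge, K, 1.0, 3.0, a0, -1)
assert w_lo > w_la
assert abs(w_lo - math.sqrt(2 * K / 1.0)) < 1e-9
assert abs(w_la - math.sqrt(2 * K / 3.0)) < 1e-9
```

The two assertions at the end anticipate the discussion of Fig. 22.9: the zone-edge LO and LA energies of the toy model scale with the lighter and heavier basis atom respectively.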















Fig. 22.9 Representative acoustic and optical phonon branches and their dependence on the atom masses M1 and M2. (a) Phonon dispersion for M1 = M2. If the two atoms have the same mass, the optical and acoustic branches touch at the BZ edge. (b) Phonon dispersion with M1 ≠ M2. Heavier masses lower the phonon energies. Though a gap appears in the phonon spectrum in this 1D example of different masses, such gaps in the phonon spectra may or may not exist in real crystals of higher dimensions. Additional transverse acoustic (TA) and transverse optical (TO) modes are not captured in this toy model.

where 1/Mr = 1/M1 + 1/M2 is the reduced mass. Solving this requires equating to zero a 2×2 determinant of the form |D(β) − ω_β² δ_ij| = 0. This toy model for phonons already has the ingredients for the calculation of accurate phonon dispersions. For the full phonon dispersion of solids, accuracy is increased by not limiting the interactions to nearest neighbors, and by solving a secular equation of a larger-order dynamical vibration matrix with realistic ”spring constants” K obtained from first principles. We do not pursue this calculation here, but discuss the results of the toy model captured in Equation 22.2 as shown in Fig. 22.9, and compare the result with Fig. 22.10, which shows the experimental phonon dispersions of Si, GaAs, and GaN, verified by accurate calculations. Fig. 22.9 (a) shows that the ω₊ branch is the optical mode, and the ω₋ branch is the acoustic mode discussed earlier. For semiconductors such as Ge, Si, or diamond, because the atoms in the basis have identical masses (M1 = M2), the branches touch at the Brillouin zone edge. Note that the actual phonon dispersion of Si shown in Fig. 22.10 shows the LA and LO branches from Γ → X following the same dispersion as Fig. 22.9 (a), including the fact that they touch at the X-point. On the other hand, when the atoms in the unit cell are different,



Fig. 22.10 Phonon dispersions of GaAs, silicon, and GaN. The phonon energy h¯ ω β is shown on the left in wavenumber (cm−1 ) units, and meV units on the right. The wavevectors β span paths of high symmetry in the Brillouin zone of the respective crystal structures (see Figures 10.10 and 10.12 for the paths). Note that the phonon energies increase for light atoms, and the substantial gap in the spectrum of GaN between optical and acoustic modes. The dashed line cutting across is k b T = 26 meV at 300 K.

such as in GaAs, the LO and LA modes do not touch at the X-point, as seen in both Fig. 22.9 (b) and Fig. 22.10 for GaAs. The maximum LA energy is proportional to √(K/M2), where M2 is the mass of the heavier atom, and the LO energy at this wavevector is proportional to √(K/M1), where M1 is the mass of the lighter atom. The optical phonon energies are dominated by the lighter atom due to their dependence on the reduced mass. Notice in Fig. 22.10 that the optical phonon energies increase substantially going from GaAs to Si to GaN, because at least one atom of the unit cell becomes lighter. In the toy model, the optical phonon energy may be related roughly to the sound velocity via ω_op ≈ 2 · (v_s/a0) · (M1 + M2)/√(M1M2). The long wavelength (β → 0) acoustic mode velocity is controlled by the sum of the masses, in which the heavy atom dominates. The sound velocity is higher for crystals made of lighter atoms and stronger chemical bonds. Compared to the LA phonon branches, the TA phonon dispersions of GaAs and Si start out and stay at lower energies, both from Γ → ∆ → X and Γ → Λ → L. The TA modes of GaAs and Si reach zero group velocity (v_g = ∂ω_β/∂β) at the X and L points. The number of atoms per unit cell is N = 2 for GaAs and Si, and N = 4 for GaN. From Fig. 22.10, it is instructive to verify that these semiconductors indeed have 3N phonon branches, of which 3 are acoustic and 3N − 3 are optical. The entire phonon spectrum for all wavevectors β of most semiconductors can be measured by inelastic neutron spectroscopy (see Fig. 22.11). More recently, inelastic X-ray scattering has also been used for this purpose. The long-wavelength phonon energies near β ≈ 0, however, can be measured in table-top systems by studying the reflection of light of much lower energy than X-rays.
As discussed in Section 22.1, this was already achieved around 1900 before neutrons were discovered by studying the interaction of light with crystals, which is the topic of discussion in the next section.

Fig. 22.11 B. Brockhouse (above) developed neutron scattering techniques to study solids with C. Shull, for which they shared the 1994 Nobel Prize in Physics. Brockhouse and P. K. Iyengar in 1958 measured the phonon dispersion of germanium. Nuclear reactors form the source of neutrons for such spectroscopy.


Fig. 22.12 Interaction of the TO phonon mode of a polar crystal with light in the form of a transverse electromagnetic wave.










Fig. 22.13 For a polar semiconductor (e.g. GaN here), the frequency-dependent (a) dielectric constant er (ω ) and (b) optical reflectivity are strongly influenced by the TO and the LO phonon modes.

22.3 Optical conductivity

The electric field component of light E0 e^{i(k·x−ωt)} interacts strongly with the transverse optical (TO) modes of polar semiconductors when the frequency of light matches that of the optical phonon, ω ∼ ω_TO. A simple reason is seen in Fig. 22.12: the polar crystal has atoms that carry net alternating charges of opposite signs. Assuming light to be a transverse electromagnetic wave, the force due to the electric field of light pushes the positively and negatively charged atoms apart, generating a TO mode only if light is of the same wavelength as the TO phonon mode. The interaction of light with the polar semiconductor crystal is characterized quantitatively by a frequency-dependent relative dielectric constant er(ω) shown in Fig. 22.13 (a), and a frequency-dependent reflectivity shown in Fig. 22.13 (b). We will now obtain the quantitative relationships using the classical Drude–Lorentz oscillator model. Let us assume that the net charge on the atom of mass M₊ is +q⋆ and on the atom of mass M₋ is −q⋆. Let us also assume that the displacements of the atoms from their equilibrium positions are x₊ and x₋ respectively. We can then use Newton's law with a damping rate γ and K = M_r ω_TO² to get for the two masses

$$ M_+\frac{d^2x_+}{dt^2} = -M_+\omega_{TO}^2(x_+ - x_-) - M_+\gamma\frac{dx_+}{dt} + q^\star E(t) $$
$$ M_-\frac{d^2x_-}{dt^2} = -M_-\omega_{TO}^2(x_- - x_+) - M_-\gamma\frac{dx_-}{dt} - q^\star E(t) $$
$$ \implies \frac{d^2(x_+ - x_-)}{dt^2} + \omega_{TO}^2(x_+ - x_-) + \gamma\frac{d(x_+ - x_-)}{dt} = \frac{q^\star}{M_r}E(t), $$


and upon writing x₊ − x₋ = x0 e^{i(βx−ω_βt)} and E(t) = E0 e^{i(k·x−ωt)}, it is seen that the modes only interact when the wavelength of light is matched to that of the phonon (k = β). This cancels the spatial dependence on both sides. Retaining the time dependence, we get

⁵ The damping, or ”friction”, term γ represents processes that scatter light, similar to the momentum scattering rate for electrons in the Drude model in Chapter 2. For example, it could be dislocations in a crystal.

$$ (-\omega^2 + \omega_{TO}^2 - i\gamma\omega)\,x_0 = \frac{q^\star}{M_r}E_0 \implies x_0 = \frac{q^\star E_0/M_r}{\omega_{TO}^2 - \omega^2 - i\gamma\omega} \quad (22.4) $$

as the amplitude of the relative vibration as a function of the electric field and the frequency of the light wave⁵. Because p = q⋆x0 is a microscopic dipole and there are N of them per unit volume, the electronic


polarization that the external electric field of light produces is

$$ P = Nq^\star x_0 = \frac{N(q^\star)^2}{M_r}\cdot\frac{1}{\omega_{TO}^2 - \omega^2 - i\gamma\omega}\,E_0. $$

$$ D = \epsilon_r\epsilon_0 E = \epsilon_0 E + \chi\epsilon_0 E + P_{ext} \implies \epsilon_r(\omega) = 1 + \chi + \frac{N(q^\star)^2}{\epsilon_0 M_r}\cdot\frac{1}{\omega_{TO}^2 - \omega^2 - i\gamma\omega}, $$



which is an expression for the frequency-dependent relative dielectric constant, where χ is the electric susceptibility. Note that er(ω = 0) = 1 + χ + N(q⋆)²/(e0 Mr ω_TO²) = edc is the low-frequency (dc) relative dielectric constant, and er(ω → ∞) = 1 + χ = e∞ is the high-frequency relative dielectric constant. This implies edc − e∞ = N(q⋆)²/(e0 Mr ω_TO²), with which the frequency dependence of the relative dielectric constant is re-written in terms of experimentally measurable quantities as

$$ \epsilon_r(\omega) = \epsilon_\infty + (\epsilon_{dc} - \epsilon_\infty)\,\frac{\omega_{TO}^2}{\omega_{TO}^2 - \omega^2 - i\gamma\omega} = \epsilon_1 + i\epsilon_2. \quad (22.7) $$

Table 22.1 Relation between real and imaginary parts of dielectric constants and refractive indices and the optical reflectivity.


Fig. 22.13 (a) shows an example of er(ω) for GaN with edc = 8.9, e∞ = 5.35 and no damping (γ = 0). As the frequency of the light wave approaches the TO phonon frequency, the strong resonance sharply increases the relative dielectric constant from its DC value of edc. er(ω) then turns negative, passes through zero at what will be seen in the next section to be the LO phonon frequency ω_LO, and increases towards e∞ at frequencies ω higher than ω_LO, but with ℏω lower than interband transition energies. The entire frequency dependence of the optical response of semiconductors, including interactions in addition to phonons, will be discussed in Chapter 27. The real and imaginary parts of the relative dielectric constant (e1, e2) are linked to the real and imaginary parts of the refractive index (η, κ) of the solid via the well-known relations given in Table 22.1, from which the optical reflectivity R(ω) is obtained using Equation 22.7 from experimentally known parameters. Fig. 22.13 (b) shows the calculated reflectivity corresponding to Fig. 22.13 (a) with a small nonzero γ. The semiconductor reflects IR light of frequencies between the TO and LO frequencies with R ≈ 1: this is the Reststrahlen band discussed earlier. The reflectivity therefore constitutes a direct measurement of the TO and the LO phonon frequencies of a semiconductor. The deviation from R ≈ 1 is a measure of the damping parameter γ. The photonic bands are obtained by solving ω = ck/√(er(ω)) with er(ω)

$$ \eta + i\kappa = \sqrt{\epsilon_1 + i\epsilon_2} $$
$$ \epsilon_1 = \eta^2 - \kappa^2, \qquad \epsilon_2 = 2\eta\kappa $$
$$ \eta = \sqrt{\frac{\sqrt{\epsilon_1^2+\epsilon_2^2}+\epsilon_1}{2}}, \qquad \kappa = \sqrt{\frac{\sqrt{\epsilon_1^2+\epsilon_2^2}-\epsilon_1}{2}} $$
$$ R = \frac{(\eta-1)^2+\kappa^2}{(\eta+1)^2+\kappa^2} $$













from Equation 22.7. Of the four roots of ω obtained, two are positive. These branches are shown in Fig. 22.14 as a function of the light wavevector k. They are the allowed photon dispersion inside the semiconductor crystal, modified by the interaction with optical phonons.
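The dielectric and reflectivity curves of Fig. 22.13 can be reproduced in a few lines; a sketch using Equation 22.7 and the Table 22.1 relations, with the GaN-like values quoted above (the small damping γ is an illustrative choice, not a measured value):

```python
import cmath, math

def epsilon_r(w, eps_dc, eps_inf, w_TO, gamma):
    """Equation 22.7: er(w) = e_inf + (e_dc - e_inf) wTO^2/(wTO^2 - w^2 - i*gamma*w)."""
    return eps_inf + (eps_dc - eps_inf) * w_TO**2 / (w_TO**2 - w**2 - 1j * gamma * w)

def reflectivity(w, eps_dc, eps_inf, w_TO, gamma):
    """Table 22.1: eta + i*kappa = sqrt(e1 + i*e2), R = ((eta-1)^2+k^2)/((eta+1)^2+k^2)."""
    n = cmath.sqrt(epsilon_r(w, eps_dc, eps_inf, w_TO, gamma))
    eta, kap = n.real, n.imag
    return ((eta - 1.0)**2 + kap**2) / ((eta + 1.0)**2 + kap**2)

# GaN-like values from the text; frequencies in units of w_TO; gamma illustrative
eps_dc, eps_inf, w_TO, gamma = 8.9, 5.35, 1.0, 0.01
w_LO = w_TO * math.sqrt(eps_dc / eps_inf)   # where er crosses zero (Section 22.4)

# Inside the Reststrahlen band (w_TO < w < w_LO) the crystal is nearly perfectly reflecting
R_mid = reflectivity(0.5 * (w_TO + w_LO), eps_dc, eps_inf, w_TO, gamma)
assert R_mid > 0.9

# Well below the resonance, R reduces to the ordinary dielectric-mirror value
R_dc = reflectivity(1e-6, eps_dc, eps_inf, w_TO, gamma)
assert abs(R_dc - ((math.sqrt(eps_dc) - 1) / (math.sqrt(eps_dc) + 1))**2) < 1e-3
```

Sweeping w through the resonance and through w_LO reproduces the negative-er window and the R ≈ 1 Reststrahlen plateau of Fig. 22.13 (b).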

Fig. 22.14 Phonon-polariton modes for the propagation of light in the polar semiconductor with the simplified dielectric properties similar to GaN.


The strong interaction between photons and phonons splits the ordinary light dispersion into an upper and a lower branch. The modes near the intersection of the photon and phonon dispersions have the properties of both light and matter. These quasiparticles, called phonon-polaritons, have an energy gap in the Reststrahlen window.

22.4 Lyddane–Sachs–Teller equation

The relative dielectric constant vanishes at the LO phonon frequency, er(ω_LO) = 0. An interesting relation between phonon frequencies and dielectric constants is obtained from Equation 22.7 (with γ = 0):

$$ \frac{\omega_{LO}^2}{\omega_{TO}^2} = \frac{\epsilon_{dc}}{\epsilon_\infty}, $$

Fig. 22.15 Raman spectroscopy by inelastic scattering of light from crystals directly measures the Raman-active LO and TO modes. This method is used extensively in chemical identification and as a sensitive measurement of strain in semiconductor materials and heterostructures.

Fig. 22.16 C. V. Raman (above) with his student K. Krishnan discovered inelastic scattering of light as the visible wavelength analogue of X-ray Compton scattering. Raman was awarded the Nobel Prize in Physics in 1930.


which is called the Lyddane–Sachs–Teller equation after the researchers who derived it. Since light has very small k, it interacts with the phonon modes near β ≈ 0. Since for polar compound semiconductors ω_LO > ω_TO, edc is larger than e∞. But for non-polar elemental semiconductors such as Si, Ge, or diamond, ω_LO = ω_TO for β ≈ 0, and therefore edc = e∞ in such semiconductors. They do not produce Reststrahlen bands, since they are not IR active, as is discussed briefly. A remarkable consequence for the propagation of both light and electrons occurs as a result of the above considerations. Consider the requirement that in the absence of net unbalanced charge, the divergence of the displacement vector must vanish: ∇ · D = 0. Since D = e0 er E, this implies e0 er(ω)(∇ · E) = 0. Ordinarily, ∇ · E = 0 is ensured if ∇ · E = ∇ · (E0 e^{i(k·r−ωt)}) = i k · E0 e^{i(k·r−ωt)} = 0, which is true when E0 ⊥ k, i.e., when the electric field E0 is perpendicular, or transverse, to the propagation direction in which the wavevector k points. This is the case for a TEM wave. But for ω = ω_LO, where er(ω_LO) = 0, a longitudinal EM wave with E0 ∥ k is sustained in a polar semiconductor crystal due to the formation of macroscopic electric fields from the oscillating charged planes of atoms. The IR reflectivity drops to zero at ω slightly larger than ω_LO, as seen in Fig. 22.13. We will discuss the implication on electron-phonon interaction in Section 22.8. When light is reflected off a crystal, the photon can absorb or emit a phonon in the crystal. This inelastic scattering of light from crystals is called Raman scattering. It was discovered first for molecules, where light excites the molecular vibrational modes. The optical phonon vibrations of a semiconductor may be considered analogous to molecular vibrations, and similarly contain a ”fingerprint” of the atomic constituents and of the strain state. Fig. 22.15 shows the principle of Raman spectroscopy, and Fig. 22.16 shows its discoverer.
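The Lyddane–Sachs–Teller relation can be checked directly against the dielectric function of Section 22.3; a minimal sketch using the GaN values quoted earlier:

```python
import math

eps_dc, eps_inf = 8.9, 5.35   # GaN values quoted in Section 22.3
w_TO = 1.0                    # frequencies measured in units of the TO frequency

# Lyddane-Sachs-Teller: (w_LO/w_TO)^2 = eps_dc/eps_inf
w_LO = w_TO * math.sqrt(eps_dc / eps_inf)

def er(w):
    """Equation 22.7 with gamma = 0."""
    return eps_inf + (eps_dc - eps_inf) * w_TO**2 / (w_TO**2 - w**2)

assert abs(er(w_LO)) < 1e-9    # the dielectric constant indeed vanishes at w_LO
assert 1.28 < w_LO < 1.30      # the LO mode sits ~29% above the TO mode for GaN
```

Exercise 22.2 asks for the same consistency check using the measured GaN phonon energies of Fig. 22.17.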
For Raman spectroscopy of semiconductors, today a laser excitation with photon energy ℏω0 less than the semiconductor bandgap E_g is used. The reflected light has energies ℏω0 − ℏω_phonon if a phonon is


emitted by the light (called the Stokes process), and ℏω0 + ℏω_phonon if a phonon is absorbed (called the anti-Stokes process). Since a phonon can only be absorbed if it is already there, the intensity of the anti-Stokes line for Raman-active TO or LO lines is I_abs ∝ n_ph, where n_ph = 1/(e^{ℏω_ph/k_bT} − 1) is the Bose–Einstein occupation function. Thus, the intensity of the anti-Stokes line decreases as the temperature is lowered. On the other hand, the intensity of the Stokes line for phonon emission is I_em ∝ 1 + n_ph, the sum of the spontaneous and stimulated emission processes. Thus, the intensity ratio is I_abs/I_em = e^{−ℏω_ph/k_bT}, and the emission (Stokes) lines are stronger⁶. Raman spectroscopy involves three particles: two photons and one phonon, which makes it a weak process, requiring sensitive detectors. Fig. 22.17 shows experimental Raman spectra measured for single-crystal GaN and AlN.
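The Stokes/anti-Stokes intensity ratio follows directly from the Bose–Einstein occupation; a sketch (the 65 meV phonon energy is an illustrative optical-phonon scale, not a measured value for any specific crystal):

```python
import math

kT = 0.0259   # eV, thermal energy at 300 K

def bose(E_ph_eV, kT_eV):
    """Bose-Einstein occupation n_ph = 1/(exp(E/kT) - 1)."""
    return 1.0 / math.expm1(E_ph_eV / kT_eV)

E_ph = 0.065                 # illustrative optical phonon energy, ~65 meV
n_ph = bose(E_ph, kT)

I_antistokes = n_ph          # absorption: stimulated only, needs a phonon present
I_stokes = 1.0 + n_ph        # emission: spontaneous + stimulated

ratio = I_antistokes / I_stokes
assert abs(ratio - math.exp(-E_ph / kT)) < 1e-12   # = e^(-hw/kT), as in the text
assert ratio < 1.0                                 # the Stokes line is always stronger
```

The algebra in the first assertion is exact: n/(1 + n) = e^{−ℏω/k_bT} for any Bose occupation, so the ratio of the two Raman lines is itself a thermometer.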


⁶ Group theory is used to identify which optical phonon modes are Raman active, and which are IR active. As an example, the TO modes of GaAs are both Raman and IR active. The TO modes of silicon are Raman active, but not IR active.

22.5 Acoustic wave devices

The measurement of acoustic phonon modes by optical methods uses Brillouin scattering, which places even higher requirements on instrument sensitivity than Raman spectroscopy. On the other hand, electronic and microwave techniques are used to exploit sound wave propagation for devices whose importance in electronic communication systems is increasing rapidly. We discuss them in this section. Acoustic phonon modes are responsible for the propagation of sound in insulating crystals. Typical sound velocities range from v_s ∼ 10⁴ − 10⁶ cm/s in crystals, with semiconductors of lighter atoms exhibiting the highest velocities, since v_s ∼ a0√(K/(M1 + M2)), wherein the heavier mass dominates. By combining the toy model of phonons with the optical phonon energies, the longitudinal sound velocity near β ≈ 0 may be approximated roughly by v_L ∼ a_L ω_LO, and the transverse sound velocity by v_T ∼ a_T ω_TO, where a_L and a_T are the respective lattice constants in the direction of wave propagation. We have already seen in Fig. 22.10 that the mode energies ℏω_β, and as a result the group velocities ∂ω_β/∂β, of the transverse acoustic branches are nearly half of those of the longitudinal branches. As a result, sound is dominantly carried by the longitudinal modes. The longitudinal sound velocities are v_s ∼ 5.2 × 10⁵ cm/s in GaAs, ∼ 8 × 10⁵ cm/s in GaN, ∼ 11 × 10⁵ cm/s in AlN, and ∼ 18 × 10⁵ cm/s in diamond: note the increase in velocity with reduction in the mass of the atoms. Unlike diamond, GaAs, GaN, and AlN are piezoelectric. The combination of high acoustic sound velocities with piezoelectricity makes these crystals attractive for electro-acoustic resonators that are used as filters in RF and microwave electronics. Such filters convert an electromagnetic wave of frequency f0 into an acoustic wave of the same frequency.
The wavelength shrinks in this process by the ratio of the sound velocity to the speed of light, v_s/c ∼ 10⁻⁴, which shrinks the filter size from several cm to microns and makes it easily integrable on a chip. An incoming signal of frequency content

Fig. 22.17 Raman spectra measured in single-crystal GaN and AlN. Exercise 22.2 asks you to identify the optical phonon modes using Fig. 22.10, and to analyze the Lyddane–Sachs–Teller relation between the LO/TO phonon modes and the high/low frequency dielectric constants.

Fig. 22.18 Microwave filters commonly used in cellular phones use the bulk acoustic wave (BAW) resonator. AlN is widely used for BAW filters today because of the combination of desirable piezoelectric properties, high sound velocity for higher frequencies with the same size, and because it is compatible with fabrication methods. For example, f 0 ∼ 5.5 GHz for AlN of L = 1 µm. Section 22.8 discusses the physics of electromechanical coupling.


Fig. 22.19 Thermal conductivity of various semiconductors. (a) Thermal conductivity vs. temperature and with a comparison to metallic copper. (b) Room-temperature thermal conductivity as a function of the energy bandgap of various semiconductors, with those of typical insulators such as SiO2 and SiN.

f0 + ∆f is allowed to propagate through an acoustic wave cavity that only allows the desired signal of frequency f0 and rejects the others. Such cavities may be formed with acoustic waves propagating along the surface (surface-acoustic-wave, or SAW, resonators), or through the bulk (bulk-acoustic-wave, or BAW, resonators). Fig. 22.18 shows a BAW filter. A BAW cavity of length L filters the signal of frequency f0 ∼ v_s/2L as the first mode, for which half the acoustic wavelength fits in the cavity.
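The first-mode condition f0 = v_s/2L can be checked against the numbers quoted for AlN in Fig. 22.18; a one-line sketch using the sound velocity from this section:

```python
# First BAW mode: half an acoustic wavelength fits in the cavity, so f0 = vs/(2L)
vs_AlN = 11e5 * 1e-2   # 11 x 10^5 cm/s (quoted in this section), converted to m/s
L = 1e-6               # 1 um thick AlN film

f0 = vs_AlN / (2.0 * L)
assert abs(f0 - 5.5e9) < 1e6   # ~5.5 GHz, matching the value quoted in Fig. 22.18
```

The same arithmetic explains why higher-velocity crystals reach higher filter frequencies at a given, lithographically convenient film thickness.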

22.6 Thermal conductivity

Phonons carry heat in semiconductors. In Chapter 2, we saw how in metals the fact that heat energy is carried by the same electrons that carry electrical current leads to the Wiedemann–Franz law. This law is violated in semiconductors, because in them electrical current is carried by electrons (and holes), but heat is carried primarily by lattice vibrations, or phonons. Metals such as copper are thought of as very good thermal conductors. But some of the best thermal conductors at 300 K are actually wide-bandgap semiconductors such as diamond and cubic BAs; SiC, AlN, and hBN approach the 300 K thermal conductivity of copper. Figures 22.19 (a) and (b) show the thermal conductivity of various materials. The dependence of the thermal conductivity of semiconductors on temperature in Fig. 22.19 (a) has the following origin. Back in Chapter 2, Equation 2.7, we found that the thermal conductivity is given by κ = (1/3)c_v v²τ, where c_v is the specific heat capacity, v is the velocity of the heat carrier, and τ is the scattering time of the heat carrier. Since phonons are the heat carriers in semiconductors, their specific heat


capacity follows the Debye law, going as ∼ T³ at the lowest temperatures, and saturating to the Dulong–Petit classical value of 3nk_b at temperatures higher than the Debye temperature Θ_D, which corresponds roughly to the highest acoustic phonon energy. The velocity is the sound velocity for long-wavelength acoustic modes, which is independent of temperature. The product vτ is an effective mean-free path, which at the lowest temperatures becomes equal to the sample size, and is then independent of temperature. Therefore, the thermal conductivity at the lowest temperatures in ultrapure semiconductors with low defect densities is limited by size-effect phonon scattering, and goes as κ ∼ T³ as T → 0, as shown in Fig. 22.20. Deviations from the T³ low-temperature dependence are typically indications of the presence of defects and impurities that scatter phonons. Thus, κ increases sharply as the temperature increases. As the temperature approaches the Debye temperature Θ_D, phonon-phonon scattering, which was prohibited at the lowest temperatures by energy and momentum conservation requirements, becomes possible. This form of scattering is intrinsic as well, and arises from the anharmonicity of the crystal potential. The scattering rate by Fermi's golden rule is proportional to the density of phonons: 1/τ_ph ∝ n_ph ∝ 1/(e^{Θ_D/T} − 1). Because at these temperatures c_v is T-independent, and v is always T-independent, the thermal conductivity κ ∝ (e^{Θ_D/T} − 1) of semiconductors at high temperatures exhibits a maximum, as shown schematically in Fig. 22.20⁷. The corresponding experimental values at the highest temperatures drop as κ ∝ (e^{Θ_D/T} − 1) ∝ 1/T. Exercise 22.4 outlines a more quantitative treatment of thermal conductivity and the effect of defects based on the BTE of Chapter 21.
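The crossover described above can be illustrated with a qualitative toy model; the Debye temperature, the strength of the phonon-phonon term, and the units below are all illustrative choices, not a quantitative calculation for any material:

```python
import math

def kappa(T, theta_D=600.0, A=1.0):
    """Toy model of kappa = (1/3) c_v v^2 tau in arbitrary units, with v constant.
    c_v: ~T^3 below the Debye temperature, saturated above (crude interpolation).
    1/tau: a T-independent boundary (size-effect) term plus a phonon-phonon term
    proportional to the phonon occupation n_ph ~ 1/(exp(theta_D/T) - 1)."""
    cv = min((T / theta_D) ** 3, 1.0)
    inv_tau = 1.0 + A / math.expm1(theta_D / T)
    return cv / inv_tau

Ts = [5, 10, 20, 80, 200, 600, 1500]
ks = [kappa(T) for T in Ts]

# Boundary-limited regime: kappa ~ T^3 at the lowest temperatures
assert abs(kappa(10) / kappa(5) - 8.0) < 0.01

# kappa rises, peaks, then falls once phonon-phonon scattering turns on
assert ks.index(max(ks)) not in (0, len(ks) - 1)
```

The two assertions capture the two limits of Fig. 22.20: the T³ rise set by sample size at low T, and the maximum followed by the ∼1/T fall once anharmonic phonon-phonon scattering dominates.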

Fig. 22.20 Temperature dependence of thermal conductivity of semiconductors, and schematic representation of 3phonon scattering processes due to the anharmonicity of the atomic potentials.

⁷ The phonon-phonon scattering involves both fission and fusion processes because, unlike electrons, their numbers need not be conserved. See Exercise 22.6 for a discussion of why normal phonon processes are not enough, and Umklapp processes are needed to explain experimental thermal conductivities.

22.7 Phonon number quantization

Let N_β = 1/(e^{ℏω_β/k_bT} − 1) be the Bose–Einstein occupation function for the phonon mode indexed by the wavevector β and of energy ℏω_β at temperature T. Consider a semiconductor of mass density ρ (e.g. in kg/m³ units) whose entire crystal volume is L³. For extensive use in the next section and in Chapter 23 to evaluate the scattering of electrons by phonons, we will now show the important quantum mechanical result that the squared vibration amplitudes due to phonon absorption and phonon emission are given respectively by⁸

$$ |u_\beta^{abs}|^2 = \frac{\hbar N_\beta}{2\rho L^3\omega_\beta} \quad \text{and} \quad |u_\beta^{em}|^2 = \frac{\hbar(N_\beta+1)}{2\rho L^3\omega_\beta}. \quad (22.9) $$
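Equation 22.9 is easy to evaluate numerically to get a feel for the magnitudes involved; the mass density, mode frequency, and normalization volume below are illustrative GaAs-like numbers, not values used elsewhere in the text:

```python
import math

hbar = 1.054571817e-34   # J*s
kB = 1.380649e-23        # J/K

def u2(N_beta, rho, L, omega, emission=False):
    """Equation 22.9: |u|^2 = hbar*N_beta/(2*rho*L^3*omega) for absorption,
    and hbar*(N_beta + 1)/(2*rho*L^3*omega) for emission."""
    return hbar * (N_beta + (1.0 if emission else 0.0)) / (2.0 * rho * L**3 * omega)

# Illustrative numbers: a GaAs-like mass density and an optical-phonon-scale frequency
rho = 5.3e3      # kg/m^3
L = 1e-6         # normalization volume (1 um)^3
omega = 5e13     # rad/s (~33 meV)

N = 1.0 / math.expm1(hbar * omega / (kB * 300.0))   # Bose-Einstein occupation at 300 K

u_abs = math.sqrt(u2(N, rho, L, omega))
u_em = math.sqrt(u2(N, rho, L, omega, emission=True))

assert u_em > u_abs    # the +1 is the spontaneous-emission contribution
assert u_abs < 1e-12   # far smaller than a lattice constant for this volume
```

The per-mode amplitude is minuscule because the vibration is shared by the whole normalization volume; it is the sum over all modes that produces the thermal displacements of atoms.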


Each phonon mode is an independent harmonic oscillator. When phonons are in thermal equilibrium, the energy stored in the mode β of angular frequency ω β is Eβ = ( Nβ + 12 )h¯ ω β , which is the sum of the kinetic energy of the moving atoms and the potential energy of

⁸ We are drawing on a semi-classical version of the proof. The rigorous proof is provided in quantum mechanics textbooks for the harmonic oscillator.


⁹ The squared amplitude for the phonon absorption process is proportional to the number of phonons present, N_β (i.e. it is a stimulated process). But the squared amplitude for the emission process can be both stimulated and spontaneous, leading to the N_β + 1 dependence. This semi-classical argument thus gives the average of the two squared amplitudes of the exact quantum result of Equation 22.9.




the springs. The amplitude of vibrations of the standing wave and the corresponding kinetic energy stored in a unit volume are

$$ u = u_\beta\cos\omega_\beta t \implies KE(t) = 2\rho L^3\omega_\beta^2 u_\beta^2 \sin^2\omega_\beta t. $$

For a harmonic oscillator, ⟨KE⟩ = ⟨PE⟩ over a period. With ⟨sin²ωt⟩ = 1/2, the amplitude of vibrations of a given mode u_β in that case is⁹

$$ \langle KE\rangle + \langle PE\rangle = 2\rho L^3\omega_\beta^2 u_\beta^2 = \Big(N_\beta + \frac{1}{2}\Big)\hbar\omega_\beta \implies u_\beta^2 = \frac{\hbar\big(N_\beta+\frac{1}{2}\big)}{2\rho L^3\omega_\beta}. \quad (22.11) $$

22.8 Electron-phonon interaction







In the preceding sections of this chapter we have discussed how the transport of light, heat, and sound through semiconductors is affected by phonons. This section is devoted to understanding how phonons affect the transport of mobile electrons and holes. We saw that light either forms quasiparticles with phonons (e.g. phonon-polaritons) when the interaction is strong, or is scattered by phonons when the interaction is weak (e.g. Raman or Brillouin scattering). Similarly, when the interaction of electrons with phonons is strong, they form quasiparticles such as polarons. Phonons simply scatter electrons when the interaction is weak. For most conventional semiconductors, the interaction is weak, and the primary result of electron-phonon interactions is scattering. The strengths of the interactions are captured by writing the total energy of the electron-phonon system as






Fig. 22.21 A parabolic electron bandstructure E(k ) and the phonon bandstructure h¯ ωk plotted to highlight the points of intersection.

¹⁰ An extreme case of strong coupling of electrons and phonons results in the phenomenon of superconductivity.

$$ H = \underbrace{H_{\text{electron}}}_{E(k)} + \underbrace{H_{\text{phonon}}}_{\hbar\omega_\beta} + \underbrace{H_{\text{electron-phonon}}}_{W_\beta(r,t)}. \quad (22.12) $$

The first term is the electron energy in the presence of a perfectly still lattice (no phonons), which is the electron bandstructure E(k). The second term allows the motion of the nuclei of the atoms of the crystal around their equilibrium positions, and is the sum of the kinetic energy of the moving atoms and the potential energy of the ”springs” between them: this is the phonon bandstructure ℏω_β discussed in this chapter. The third term is the energy of interaction of electrons and phonons. Fig. 22.21 shows a parabolic electron bandstructure E(k) juxtaposed with the phonon bandstructure ℏω_β plotted for the same phonon and electron wavelengths, with β = k = 2π/λ. These have already been discussed earlier. There are two intersections: one of electrons with optical phonons at k_op ∼ 0.1 × π/a, about 1/10th of the BZ, and another with acoustic phonons at k_ac ∼ 0.01 × π/a, at ∼ 1/100th of the BZ. Strong coupling may occur at these intersections, leading to the formation of quasiparticles¹⁰ – which are eigenstates of the entire electron-phonon Hamiltonian of Equation 22.12. Since in most semiconductors the third term is much smaller than the first two, we will


treat it as a perturbation W_β(r, t) that depends on the electron wavevector k and the phonon wavevector β. Typical energy scales are E(k) ∼ eV, ħω_β ∼ 1−10s of meV, and W_β(r, t) ∼ 1 meV or much less. The rest of this section is devoted to finding W_β(r, t) in semiconductors.

Acoustic deformation potential (ADP): The displacement of an atom from its equilibrium position in a longitudinal acoustic (LA) wave is u_β(r) = β̂_L u_β e^{i(β·r − ω_β t)}, where β̂_L is the unit vector along the longitudinal direction of the wave propagation, along the wavevector β. The difference in displacement of atomic planes separated by a small distance dr is then u_β(r + dr) − u_β(r) ≈ (∇·u_β(r))·dr. The spatial derivative, or divergence11, of the displacement u_β(r) is dimensionless, and is physically the "compression" or "dilation" of the crystal lattice, i.e. the strain. As the longitudinal wave propagates through the crystal, atomic planes are pushed closer to each other in some regions, and farther apart in others, as shown in Fig. 22.22. Since semiconductors with smaller lattice constants have larger bandgaps, the conduction band edge in compressively strained regions shifts above its equilibrium value. Bardeen and Shockley (Fig. 22.23) modeled this in an intuitive manner by assuming that the conduction band edge energy shift is linearly proportional to the strain, giving a perturbation

W_β^ADP(r, t) = ΔE_c(r) = D_a ∇·u_β(r) = i D_a β u_β e^{i(β·r − ω_β t)},   (22.13)


where the proportionality constant D_a is called the conduction band deformation potential. A typical conduction band-edge deformation potential is of the order of D_a ∼ 10 eV, comparable to typical band widths in the electronic bandstructure. The strain is ∼ u_β/a_0, where a_0 is the lattice constant, and Equation 22.9 gives u_β. Since strain is a tensor, so is the deformation potential. We restrict the discussion here to the simple isotropic cases, for which D_a is a scalar. A valence band deformation potential is similarly defined. The total bandgap deformation potential is the sum of the conduction and valence band deformation potentials. It is measured in photoluminescence experiments as the shift of the semiconductor bandgap under externally applied strain. Conduction and valence band deformation potentials are indirectly accessible from transport measurements of n-type and p-type carriers respectively. It is straightforward to calculate the deformation potentials directly from the electronic bandstructure by evaluating the shift of band-edge energy eigenvalues as the lattice constant is varied. This may be done most simply in the tight-binding bandstructure evaluations of Chapter 11, and is more frequently done in the k·p bandstructure of Chapter 12. We will now express the spatial part of every phonon scattering potential in the generalized form


W_+(r, t) = K_β u_β e^{+iβ·r} e^{−iω_β t}  and  W_−(r, t) = K_β u_β e^{−iβ·r} e^{+iω_β t},   (22.14)

Electron-phonon interaction 545

11 If all atoms are displaced equally, then the whole crystal has moved, and there is no compression or dilation. There is compression or dilation of the crystal if adjacent displacements are unequal, and thus the derivative ∂u/∂x is representative of such strain.

Fig. 22.22 Acoustic deformation potential perturbation to the band edges is caused by the compression and dilation of the lattice.

Fig. 22.23 William Shockley was a pioneer in semiconductor physics and devices. He formulated some of the earliest theories to explain electron transport phenomena in semiconductors, and with Bardeen formulated the theory of deformation potential scattering. He explained the curious phenomenon of the saturation of current in semiconductors by invoking phonon scattering. He was awarded the Nobel Prize in Physics in 1956 with Bardeen and Brattain for the invention of the transistor, the device that launched the information age.

546 Taking the Heat: Phonons and Electron-Phonon Interactions

where + is for phonon absorption and − for phonon emission. Because W_± has units of energy, K_β has units of energy per length. For LA phonon ADP scattering, we read off K_β from Equation 22.13: K_β = iD_a β for ADP scattering.
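As a sanity check on the scales quoted above, the ADP perturbation is just the deformation potential times the strain. A minimal sketch in Python; D_a = 10 eV is the typical value from the text, while the strain of 10⁻⁴ is a representative assumption chosen to match the ∼1 meV scale of W_β:

```python
# Order-of-magnitude check: W_ADP = D_a * strain.
# D_a = 10 eV (typical deformation potential from the text); strain = 1e-4
# is an assumed representative lattice dilation, div(u).
Da_eV = 10.0      # conduction band deformation potential, eV
strain = 1e-4     # dimensionless dilation
W_meV = Da_eV * strain * 1e3
print(f"ADP perturbation ~ {W_meV:.1f} meV")
```

With these inputs the perturbation is ∼1 meV, consistent with the hierarchy E(k) ≫ ħω_β ≫ W_β stated earlier.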

12 Fig. 22.7 shows this vividly!

13 A dimensionless electromechanical coupling coefficient K is defined via the relation K²/(1 − K²) = (e_pz²/ε_dc)/(ρv_s²) = e_pz²/(ε_dc ρ v_s²), which is the ratio of the electrical energy per unit volume e_pz²/ε_dc to the mechanical energy per unit volume ρv_s². Typically K² < 1 for semiconductors. A higher K² is desired for the BAW and SAW devices that were discussed in Section 22.5.
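A quick numerical sketch of this definition (Python); the AlN-like material constants below are representative assumptions for illustration, not values taken from this chapter:

```python
# K^2/(1-K^2) = e_pz^2 / (eps_dc * rho * v_s^2): ratio of electrical to
# mechanical energy density. AlN-like inputs (all assumed): e_pz ~ 1.5 C/m^2,
# eps_dc ~ 9*eps0, rho = 3260 kg/m^3, v_s ~ 1.1e4 m/s.
eps0 = 8.854e-12
e_pz, eps_dc, rho, v_s = 1.5, 9.0 * eps0, 3260.0, 1.1e4
ratio = e_pz**2 / (eps_dc * rho * v_s**2)
K2 = ratio / (1.0 + ratio)      # solve for K^2
print(f"K^2 = {K2:.3f}")        # a few percent, so K^2 < 1 as the text states
```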


For TA phonons, since u_β(r) = β̂_T u_β e^{i(β·r − ω_β t)}, the direction of atomic vibrations β̂_T is perpendicular to β, the direction of the wave propagation. The strain is thus ∇·u_β(r) ∝ β̂_T·β = 0, so there is no coupling with electrons, and no scattering12.

Piezoelectric potential (PZ): When LA phonons propagate in a polar crystal, the strain field ∇·u_β(r) creates alternating planes of positive and negative charges because of the piezoelectric effect. The linear piezoelectric coefficient of polar semiconductors is defined so that the net dipole moment per unit volume, or the polarization magnitude, is given by P = e_pz ∇·u_β(r). Though e_pz in general is a tensor, we discuss the scalar case13. The units of e_pz are charge per unit area. Because the net charge in the semiconductor is zero, Maxwell's equations give, for the scalar case,

D = ε_dc E + e_pz ∂u_β(r, t)/∂r  with  D = 0  ⟹  E(r, t) = −(e_pz/ε_dc) ∂u_β(r, t)/∂r,

W_β^PZ(r, t) = −q ∫ dr E(r, t) = (q e_pz/ε_dc) u_β e^{i(β·r − ωt)}  ⟹  K_β = q e_pz/ε_dc for PZ.


14 Material properties relevant to electron-phonon interactions are listed for various semiconductors in Tables A.2 and A.3 in the Appendix.

15 ODP scattering of electrons for small k in the Γ-valley of direct bandgap semiconductors is forbidden by symmetry-constrained selection rules. It is allowed if the band minima are not at the Γ-point.

There is no piezoelectric scattering in non-polar semiconductors such as silicon and germanium. The TA modes do not couple to electrons through the piezoelectric component for the same reason there is no ADP coupling of transverse modes14.

Optical deformation potential (ODP): Since the atomic planes vibrate out of phase in optical modes, the local strain is proportional not to the differences in displacement captured by ∇·u_β, but to the nearest-neighbor atomic displacement u_s − u_{s+1}, which is the displacement vector u_β(r) itself15. The optical deformation potential D_op is thus defined as the proportionality constant

W_β^ODP(r, t) = D_op u_β e^{i(β·r − ω_β t)}  ⟹  K_β = D_op for ODP scattering.   (22.17)

A typical D_op is of the order of the bandgap divided by the lattice constant, ∼ E_g/a_0, which for most semiconductors is of the order of eV/Å, or 10⁸ eV/cm.

Polar optical phonon (POP): In polar crystals, the optical modes produce effective charges q* as discussed in Section 22.3. There we obtained the relation ε_dc − ε_∞ = N(q*)²/(M_r ω_TO²), with the mass density ρ =




Electron-phonon interaction 547

M_r/Ω, where M_r is the reduced mass of the atoms in a unit cell of volume Ω, and N = 1/Ω. This yields the Born effective charge

q* = Ω ω_TO √(ρ(ε_dc − ε_∞)) = Ω ω_LO ε_∞ √(ρ(1/ε_∞ − 1/ε_dc)).   (22.18)

The second form results from the first using the Lyddane–Sachs–Teller relation of Equation 22.8. The scattering potential is obtained in a manner similar to piezoelectric scattering. The microscopic charges have a dipole moment p = q* u_β, which produces a net volume polarization P = q* u_β/Ω. Using this in Maxwell's equation yields

D = ε_∞ E + q* u_β/Ω  with  D = 0  ⟹  E(r, t) = −q* u_β/(Ω ε_∞)

⟹ W_β^POP(r, t) = −q ∫ dr E(r, t) = (q q*/(iβ ε_∞ Ω)) u_β e^{i(β·r − ω_β t)}

⟹ K_β = q q*/(iβ ε_∞ Ω) = (q ω_LO/(iβ)) √(ρ(1/ε_∞ − 1/ε_dc)) for POP scattering.   (22.19)

The polar optical phonon scattering potential is called the Fröhlich potential (Fig. 22.24). Unlike the deformation potentials, this potential is obtained entirely in terms of known macroscopic physical properties of the semiconductor crystal. The polar optical phonon scattering potential therefore has no unknown parameters and is accurate.

Geometry of electron-phonon scattering: Since we know the perturbation potentials due to phonons that are experienced by electrons, we are in a position to compute the scattering rates. If a phonon in mode β scatters an electron from state |k⟩ → |k′⟩, momentum conservation requires k′ = k + β for phonon absorption, and k′ = k − β for phonon emission, neglecting Umklapp processes. Energy conservation requires E_k′ = E_k + ħω_β for phonon absorption and E_k′ = E_k − ħω_β for phonon emission. The electron experiences the phonon scattering potentials W_+(r, t) = K_β u_β e^{+iβ·r} e^{−iω_β t} for absorption and W_−(r, t) = K_β u_β e^{−iβ·r} e^{+iω_β t} for emission. Fermi's golden rule immediately yields the absorption and emission rates as
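Because Equation 22.18 contains only measurable constants, it can be checked numerically. The Python sketch below uses GaN-like inputs; all material numbers are representative assumptions for illustration, not values quoted in this chapter:

```python
import math

# Born effective charge, Eq. 22.18: q* = Omega * w_TO * sqrt(rho*(eps_dc - eps_inf)),
# with rho = M_r/Omega. GaN-like inputs below are assumed representative values.
amu, eps0, e, c = 1.66054e-27, 8.854e-12, 1.602e-19, 2.998e10  # kg, F/m, C, cm/s
M_r = (69.72 * 14.01) / (69.72 + 14.01) * amu  # reduced mass of a Ga-N pair, kg
Omega = 22.8e-30          # volume per formula unit, m^3 (wurtzite GaN, assumed)
rho = M_r / Omega         # note: this is M_r/Omega, not the crystal mass density
w_TO = 2 * math.pi * c * 560.0                 # TO phonon at ~560 cm^-1, rad/s
eps_dc, eps_inf = 8.9 * eps0, 5.35 * eps0      # assumed GaN dielectric constants
q_star = Omega * w_TO * math.sqrt(rho * (eps_dc - eps_inf))
print(f"q* ≈ {q_star / e:.2f} e")
```

With these inputs the result is about 2.5 elementary charges, in the range of Born effective charges reported for GaN, confirming that the Fröhlich coupling is fixed by macroscopic constants alone.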

Fig. 22.24 Herbert Fröhlich performed seminal work on understanding how electrons interact with phonon modes in crystals. His work provided new insights to the quantum mechanics of transport phenomena in semiconductors as well as in understanding the origin of superconductivity.

1/τ_k^abs(β, k′) = (2π/ħ) |⟨k′| K_β u_β e^{+iβ·r} |k⟩|² δ[E_k′ − (E_k + ħω_β)],  and

1/τ_k^em(β, k′) = (2π/ħ) |⟨k′| K_β u_β e^{−iβ·r} |k⟩|² δ[E_k′ − (E_k − ħω_β)].

Table 22.2 Phonon scattering matrix elements.

Mode   K_β                   |K_β|² |u_β^abs|²
ADP    i D_a β               D_a² β² ħN_β / (2ρL³ω_β)
PZ     q e_pz/ε_dc           q² e_pz² ħN_β / (2ε_dc² ρL³ω_β)
ODP    D_op                  D_op² ħN_β / (2ρL³ω_β)
POP    q q*/(iβΩε_∞)         q² ω_LO² (1/ε_∞ − 1/ε_dc) ħN_β / (2β² L³ ω_β)

We postpone the complete evaluation of these scattering rates to Chapter 23, where the effect of phonons on electron mobility will be discussed. The scattering potential obtained for the four scattering



Fig. 22.25 Phonon scattering geometry for (a) elastic phonon emission and absorption, (b) inelastic phonon absorption, and (c) inelastic phonon emission.

Table 22.3 Allowed phonon β ranges.

Acoustic     Absorption              Emission
β_max/k      2(1 + m*v_s/(ħk))       2(1 − m*v_s/(ħk))
β_min/k      0                       0

Optical      Absorption              Emission
β_max/k      √(1 + ħω₀/E_k) + 1      1 + √(1 − ħω₀/E_k)
β_min/k      √(1 + ħω₀/E_k) − 1      1 − √(1 − ħω₀/E_k)
modes: ADP, PZ, ODP, and POP are summarized in Table 22.2, and carried forward and expanded in Table 23.6 in Chapter 23. The squared matrix element involves |K_β|²|u_β|². The quantum version of |u_β|² for absorption in Table 22.2 is obtained from Equation 22.9. A few things can be noted about the dependence of the squared matrix elements |K_β|²|u_β|² on the phonon wavevector β. For acoustic phonons of energy ħω_β = ħv_sβ ≪ k_BT, the occupation is N_β ≈ k_BT/(ħv_sβ) ≫ 1, and since |u_β|² ∝ N_β/ω_β, this results in a β² factor in the denominator. For ADP scattering, this cancels the β² dependence in the numerator, making the squared matrix element independent of the phonon wavelength. But for PZ scattering, the matrix element is much larger for longer wavelength phonons, as it goes as 1/β². For optical phonons, it may be assumed that ω_β ≈ ω₀, in which case N_β is independent of β. Then, the squared matrix element is independent of β for ODP scattering, but for POP scattering it goes as 1/β², similar to PZ scattering. This is expected because unlike ADP and ODP scattering, which originate from deformation, both PZ and POP scattering originate from charges whose Coulomb forces have a long range. Long-range forces typically strongly favor scattering of electrons by small angles, as we describe next. For the complete evaluation of the scattering rates and mobility in Chapter 23, the last ingredient required is an accurate description of the geometry enforced in electron-phonon scattering events to satisfy the momentum and energy conservation relations, which is discussed here. Fig. 22.25 shows a phonon in mode β scattering an electron from state |k⟩ into a state |k′⟩ for (a) elastic scattering, which is valid for low-energy acoustic phonons, and (b, c) for inelastic scattering, which is valid for optical phonons. As seen before, momentum conservation requires k′ = k + β for phonon absorption, and k′ = k − β for phonon emission, neglecting Umklapp processes.
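The β-dependences just described can be made concrete in a few lines of Python. This is a sketch: constant prefactors are dropped, and the high-temperature occupation N_β ∝ 1/β for acoustic phonons is the same approximation used in the text:

```python
# Relative squared matrix elements |K|^2 |u|^2 vs. phonon wavevector beta,
# keeping only the beta-dependence: |u|^2 ∝ N_beta/omega_beta ∝ 1/beta^2 for
# acoustic phonons (equipartition); |K|^2 ∝ beta^2 for ADP, constant for PZ.
def sq_matrix_element(beta, mech):
    u2 = 1.0 / beta**2                       # N_beta/omega_beta, up to constants
    K2 = beta**2 if mech == "ADP" else 1.0   # "PZ": K_beta independent of beta
    return K2 * u2

r_adp = sq_matrix_element(2e8, "ADP") / sq_matrix_element(1e8, "ADP")
r_pz = sq_matrix_element(2e8, "PZ") / sq_matrix_element(1e8, "PZ")
print(r_adp, r_pz)  # ADP: beta-independent (1.0); PZ: falls as 1/beta^2 (0.25)
```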
Energy conservation requires E_k′ = E_k + ħω_β for phonon absorption and E_k′ = E_k − ħω_β for phonon emission. Consider first the geometry of elastic scattering shown in Fig. 22.25 (a). The initial and final electron wavevectors (k, k′) lie on a constant energy sphere for 3D transport (angles for lower dimensions are obtained in a similar fashion). The angle between the phonon wavevector β and the initial electron wavevector k is α, which is related to θ, the angle by which the electron is scattered, by α = (π + θ)/2. So when θ: 0 → π, α: π/2 → π. It is clear that if the acoustic phonon energy is neglected compared to the electron energy, the magnitude of the phonon wavevector lies between 0 ≤ β ≤ 2k. This is an excellent approximation for acoustic phonons. We will find the exact range shortly, but we can say more about the nature of scattering before doing that. Since the squared matrix element for ADP scattering in Table 22.2 is independent of β, all 0 ≤ β ≤ 2k have the same scattering strength. Thus the electron is scattered over all angles 0 ≤ θ ≤ π with equal preference: this is isotropic scattering. On the other hand, for acoustic PZ phonon scattering, the squared matrix element goes as 1/β², which is


large for small β, implying that small α and small θ are heavily favored and the scattering is not isotropic. We use the convention that upper signs are for absorption and lower signs are for emission. For a parabolic electron energy bandstructure Ek = (h¯ k )2 /(2m? ), equating the momentum and energy before and after scattering with the angles of Fig. 22.25 (a, b, c), we get


k′ = k ± β  ⟹  (k′)² = k² + β² ± 2βk cos α,

E_k′ = E_k ± ħω_β  ⟹  (β/k)² ± 2(β/k) cos α ∓ ħω_β/E_k = 0.


For acoustic phonons, ħω_β = ħv_sβ, and we get

β/k = 2(∓cos α ± m*v_s/(ħk)) = 2(∓cos α ± v_s/v_k),   (22.22)

where v_k = ħk/m* is the electron group velocity. For a generic optical phonon dispersion ω_β, we get

β/k = ∓cos α ± √(cos²α ± ħω_β/E_k).   (22.23)
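Solving the absorption case of Equation 22.23 numerically reproduces the Table 22.3 limits. The short Python check below is a sketch, with an arbitrary example ratio ħω₀/E_k = 0.5:

```python
import math

# beta/k for optical phonon absorption: the physical (positive) root of
# x^2 + 2*x*cos(a) - r = 0, with r = hbar*w0/Ek (Eq. 22.23, upper signs).
def beta_over_k_abs(cos_a, r):
    return -cos_a + math.sqrt(cos_a**2 + r)

r = 0.5                              # example hbar*w0/Ek (assumed)
b_max = beta_over_k_abs(-1.0, r)     # alpha = pi
b_min = beta_over_k_abs(+1.0, r)     # alpha = 0
# These match the Table 22.3 bounds sqrt(1 + r) +/- 1:
assert abs(b_max - (math.sqrt(1 + r) + 1)) < 1e-12
assert abs(b_min - (math.sqrt(1 + r) - 1)) < 1e-12
print(f"beta/k ranges from {b_min:.4f} to {b_max:.4f}")
```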


Fig. 22.27 A plot of Equation 22.23 highlighting constraints on emission and absorption of phonons by electrons. The panels show successively increasing electron kinetic energy Ek compared to the phonon energy h¯ ω β . For Ek < h¯ ω β , no emission is allowed.

16 From Fig. 22.21 (b), there is a minimum energy required for electrons to emit acoustic phonons too. Though acoustic phonon scattering is considered nearly elastic, an electron cannot emit acoustic phonons unless it is moving faster than the speed of sound v_s in the semiconductor, which is of the order of 10⁵ cm/s. Such energy and angle restrictions have a direct analogy to Cherenkov radiation, which occurs when electrons move through media faster than the phase velocity of light (see Fig. 22.26).


Fig. 22.27 shows that while absorption is allowed with all angles α, an electron may emit a phonon only if it has an energy Ek more than the optical phonon energy h¯ ω β . This restricts the allowed angles. As the kinetic energy Ek of the electron increases well beyond the optical phonon energy, the scattering event starts appearing more like the elastic version where all angles are allowed16 . Equations 22.22 and 22.23 with the bounds −1 ≤ cos α ≤ +1 and the restriction of real β set the exact ranges of the magnitude of β/k allowed in phonon scattering. Consider the geometry of electron scattering from longitudinal optical (LO) phonons. A characteristic wavevector k LO is defined via the relation (h¯ 2 k2LO )/(2m? ) = h¯ ω LO . For phonon absorption shown in Fig. 22.25 (b), the electron energy difference leads to (k0 )2 = k2 + k2LO . The angle α can take all values between (0, π ), but

Fig. 22.26 Cherenkov observed that electrons moving through dielectric media faster than the velocity of light in that medium emit radiation, or a glow, which is named after him, and for which he was awarded the 1958 Nobel Prize in Physics. The acoustic analogs are the boom of a supersonic jet, and the emission of acoustic phonons by electrons.

the phonon wavevectors range in magnitude from β_min^abs = k′ − k = k[√(1 + (k_LO/k)²) − 1] to β_max^abs = k′ + k = k[√(1 + (k_LO/k)²) + 1], as can be read off from the figure. For phonon emission, shown in Fig. 22.25 (c), (k′)² = k² − k_LO², which leads to β_min^em = k − k′ = k[1 − √(1 − (k_LO/k)²)] and β_max^em = k + k′ = k[1 + √(1 − (k_LO/k)²)]. Clearly E_k ≥ ħω_LO, or equivalently k ≥ k_LO, for an electron to be able to emit an optical phonon. The tangent from the tip of the vector k to the smaller circle defines a maximum angle θ, which implies that α ranges from π to a maximum value given by cos α_max = k_LO/k. Table 22.3 shows the resulting ranges of β for both acoustic and optical phonon scattering. We will return to these considerations in Chapter 23, where we will evaluate the phonon scattering limited electron mobility.
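The characteristic wavevector k_LO is easy to evaluate. The Python sketch below uses GaN-like inputs (ħω_LO = 92 meV, m* = 0.2m₀), which are representative assumptions rather than values quoted in this section:

```python
import math

hbar, m0, q = 1.0546e-34, 9.109e-31, 1.602e-19
m_eff = 0.2 * m0        # effective mass (assumed)
E_LO = 92e-3 * q        # hbar*w_LO = 92 meV (assumed), in joules

# hbar^2 k_LO^2 / (2 m*) = hbar*w_LO  =>  k_LO = sqrt(2 m* hbar*w_LO)/hbar
k_LO = math.sqrt(2 * m_eff * E_LO) / hbar
print(f"k_LO ≈ {k_LO * 1e-9:.2f} nm^-1")

# Emission requires k >= k_LO; the maximum angle obeys cos(alpha_max) = k_LO/k.
k = 2 * k_LO
alpha_max = math.degrees(math.acos(k_LO / k))
print(f"alpha_max = {alpha_max:.0f} deg for k = 2*k_LO")
```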

22.9 Chapter summary section In this chapter, we learned that:

• The effects of phonons in crystals on the transport of light and heat were instrumental in the development of quantum mechanics.
• Phonon modes are lattice vibrations that have energies of less than 10−100 meV for most semiconductors. Phonons are bosons. Their occupation function follows the Bose–Einstein distribution function.
• Adjacent atomic planes can vibrate in phase, or out of phase, leading to acoustic and optical phonons respectively. For N atoms in the primitive cell, there are 3N phonon branches, of which 3 are acoustic branches and 3N − 3 are optical branches.
• Acoustic phonons have a zero-energy mode with the dispersion roughly ħω_β = ħv_sβ, where v_s is the sound velocity. Optical modes have small energy dispersion and are sometimes approximated as dispersionless (ω_β ≈ ω₀).
• Photons interact strongly with the TO phonon modes. The optical reflectivity at frequencies between the TO and LO energies is nearly unity. This is known as the Reststrahlen band, and is a direct measurement of the two energies.
• Photons can absorb or emit quanta of phonons upon reflection from a semiconductor. This inelastic scattering of photons by phonons is called Raman scattering for optical modes, and Brillouin scattering for acoustic modes. Both techniques are used extensively to identify (or "fingerprint") semiconductors, and also to probe their strain or stress states.
• Inelastic neutron scattering, or X-ray scattering, is used to measure the entire phonon dispersion experimentally.

Further reading

• Acoustic wave propagation is the principle behind the SAW and BAW filters used in RF and microwave electronics.
• The thermal conductivity of semiconductors is controlled primarily by phonons. Semiconductors made of light atoms, with few atoms in the unit cell, have high thermal conductivities.
• Electrons scatter quasi-elastically from acoustic phonons, and inelastically from optical phonons. The scattering rates are evaluated using Fermi's golden rule applied individually to the various phonon modes.
• The four ways phonons scatter electrons are via the acoustic deformation potential (ADP), the acoustic piezoelectric interaction (PZ), the optical deformation potential (ODP), and the polar optical phonon (POP).
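Since the phonon occupation enters every scattering rate in this chapter, it is worth evaluating the Bose–Einstein function once. The Python sketch below uses a 2 meV acoustic and a 60 meV optical phonon as representative (assumed) energies at room temperature:

```python
import math

kT_meV = 25.85   # kB*T at 300 K, in meV

def n_BE(E_meV):
    """Bose-Einstein occupation N = 1/(exp(E/kB*T) - 1)."""
    return 1.0 / (math.exp(E_meV / kT_meV) - 1.0)

print(f"acoustic (2 meV):  N ≈ {n_BE(2.0):.1f}")   # >> 1: nearly classical
print(f"optical (60 meV):  N ≈ {n_BE(60.0):.3f}")  # << 1: mostly frozen out
```

The large acoustic occupation is why acoustic phonon scattering grows linearly with temperature, while the small optical occupation is why optical phonon absorption freezes out at cryogenic temperatures.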

The classic reference for phonons in crystals is Dynamical Theory of Crystal Lattices by M. Born and K. Huang. The series Light Scattering by Solids edited by M. Cardona covers the many facets of the interaction of photons with phonons. Ziman's Electrons and Phonons has an unparalleled treatment of electron-phonon interactions looked at from various angles, and has Ziman's signature writing style that is quite a joy to read. Carrier Scattering in Metals and Semiconductors by Gantmakher and Levinson has a comprehensive description of electron-phonon interactions in metals and semiconductors. Electrons and Phonons in Semiconductor Multilayers by B. Ridley and Phonons in Nanostructures by M. Stroscio and M. Dutta are texts that go deeper into electron-phonon interactions in semiconductors and their heterostructures.

Exercises

(22.1) Phonons carry no net momentum
Just as the vibrations of water molecules in a closed bottle do not move the bottle, phonons are internal vibrations of the atoms of a crystal: they do not carry net momentum and do not move the entire crystal. To prove this, note that the NET momentum carried by all phonons in a crystal is

p = M (d/dt) Σ_s u_s.

Since the atomic displacements are of a plane-wave nature, u_s ∝ e^{iKsa}, show that the net momentum is

p = M (du/dt) Σ_s e^{isKa} = M (du/dt) · (1 − e^{iNKa})/(1 − e^{iKa}),

and since for phonon wavevectors K = ±2πr/(Na), show that the total momentum carried by the crystal due to phonons is zero. For K = 0, p = NM (du/dt) represents the case when the whole crystal (all N atoms) is moved at a velocity du/dt.

(22.2) Raman spectra and the Lyddane–Sachs–Teller relation for GaN and AlN
Fig. 22.17 showed the Raman spectra measured for GaN and AlN, two technologically important wide bandgap semiconductor materials.
(a) Using the phonon dispersion of GaN shown in Fig. 22.10, identify the phonon modes responsible for each of the three peaks in the Raman spectra.
(b) Identify ω_TO and ω_LO for GaN. If it is known that the DC relative dielectric constant of GaN is ε_dc = 8.9, use the Lyddane–Sachs–Teller relation to find the high-frequency dielectric constant ε_∞. Confirm by searching the literature whether the value you calculate is indeed experimentally observed.
(c) Assigning the same ordering to the optical phonon peaks observed in the Raman spectra of AlN, repeat the above procedure for the dielectric constants of AlN.
(d) Explain why the phonon frequencies of AlN are larger than the corresponding frequencies in GaN. Confirm that the toy model of Fig. 22.9 is reasonably accurate and can be used to predict this trend at least qualitatively.
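A sketch of the calculation asked for in part (b) above (Python). The Raman shifts used here are representative literature values for GaN, assumed for illustration; the exercise intends for you to read them off Fig. 22.17:

```python
# Lyddane-Sachs-Teller: w_LO^2/w_TO^2 = eps_dc/eps_inf
#   =>  eps_inf = eps_dc * (w_TO/w_LO)^2
w_TO, w_LO = 560.0, 740.0   # cm^-1, representative GaN Raman shifts (assumed)
eps_dc = 8.9                # given in the exercise
eps_inf = eps_dc * (w_TO / w_LO) ** 2
print(f"eps_inf ≈ {eps_inf:.1f}")   # close to the ~5.35 commonly reported for GaN
```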

(22.3) Acoustic wave filters
Discuss how surface acoustic wave (SAW) and bulk acoustic wave (BAW) filters are used in high-frequency communication systems. SAW filters operate up to ∼ 1 GHz and BAW filters from 1−10 GHz. Calculate and plot the center frequency vs. the cavity length for BAW filters for various sound velocities. In addition to the sound velocity, an important metric is the electromechanical coupling coefficient. Make a table of the materials that allow for the highest performance BAW filters.

(22.4) Calculating the thermal conductivity from the Boltzmann transport equation
The Boltzmann transport equation of Chapter 21 provides the recipe to evaluate the thermal conductivity of solids in analogy to the electrical conductivity.
(a) Denoting the heat current as Q, the nonequilibrium phonon number of mode β as n_β, the equilibrium phonon number as n_0, the phonon energy as ħω_β and the group velocity as v_β, show that the heat current is

Q = (1/L^d) Σ_β (n_β − n_0) ħω_β v_β = −[(1/L^d) Σ_β ħω_β v_β (∂n_ph/∂T)] ∂T/∂x,

which is the Fourier law for heat current with thermal conductivity κ.
(b) Using further a phonon relaxation time τ_ph, show that the thermal conductivity is obtained in the steady state, when ∂(...)/∂t = 0, as

dn_ph/dt = v_β (∂n_ph/∂T)(∂T/∂x) [inflow] − (n_ph − n_0)/τ_ph [outflow] = 0

⟹ n_ph = n_0 + v_β τ_ph (∂n_ph/∂T)(∂T/∂x)  ⟹  κ = (1/3) c_v v λ = (1/3) c_v v² τ.

Note that this is identical to the classical result that was discussed for the thermal conductivity due to conduction electrons in metals in Chapter 2, Equation 2.7. But the velocity v, specific heat cv and scattering time τ here are those of phonons, not of free–electrons. (22.5) Thermal boundary resistance Like light and electrons, phonons undergo reflections and transmission at interfaces. Discuss how this results in a thermal boundary resistance (TBR). Tabulate and compare the values of the TBR for several important interfaces, such as Si/SiO2 and GaN/SiC. Discuss how the TBR is a major bottleneck for several high-performance electronic and photonic devices. (22.6) Normal is not enough: need for Umklapp Consider an electron distribution that is carrying current in a semiconductor in response to an external electric field. If at time t = 0 the field is removed, the electron distribution must return to its equilibrium Fermi–Dirac distribution state. However, it is impossible for this to occur with only normal electron-phonon scattering events; Umklapp processes are necessary for return to equilibrium. A brief proof is outlined here: (a) A normal electron-phonon scattering process follows the momentum conservation relation

Exercises 553 ki ± q = k f , where h¯ ki is the initial electron momentum, h¯ k f is the final electron momentum and h¯ q is the phonon momentum. The energy conservation relation accompanying this process is Eel (k f ) = Eel (ki ) ± E ph (q) where the electron and phonon energies are written out explicitly. Since an Umklapp process involves a phonon of wavevector G where G is a reciprocal lattice vector, the momentum conservation relation changes to ki ± q + G = k f . How is the energy conservation relation modified?

g g

R =q s v current is Q1d L ∑k ≥0 ( E ( k ) − EFs ) v g ( k ) f s ( k ). (a) Calculate the exact net heat current Q1d = R − Q L in the 1D ballistic device. Q1d 1d

(b) Take the degenerate limit of the net heat current assuming a high carrier density and gs = gv = 1.

(b) Discuss, based on the Boltzmann transport equation, why Umklapp processes are needed to bring a non-equilibrium distribution of electrons back to equilibrium.

(c) Now the electrical bias between the source and drain is removed. Show that the thermal conductance GQ = Q1d /∆T = π 2 k2b T0 /3h in 1D for a small temperature difference (∆T T0 ) in the degenerate limit. Calculate the numerical value of GQ at T0 = 4 K and 300 K with proper units. This thermal conductance quantum sets the maximum thermal conduction across a single transport channel.

(c) At very low temperatures Umklapp scattering of electrons in metals is the dominant process leading to electrical resistivity. Discuss why this is the case if the Fermi energy EF spans a significant fraction of the Brillouin zone.

(d) Calculate the ratio GQ /G1d T0 , and show that you obtain the Lorenz number. This is the quantum form of the Wiedemann–Franz law in the ballistic limit. Compare it with the Wiedemann–Franz law in the classical Drude picture.

(d) Discuss why Umklapp processes are responsible for the high-temperature thermal conductivity of sufficiently pure electrically insulating crystals.

We have now seen that electrons produce both charge and heat currents. In metals electrons carry most of the heat, but in semiconductors, electrons carry only a part of the heat; most of the heat is carried by lattice vibrations, or phonons. The acoustic phonon velocity (or sound velocity) vs is the slope of the acoustic phonon energy dispersion E(k) = h¯ vs k. The acoustic modes transport most of the heat. Taking the same 1D semiconductor ballistic device as in this problem, now calculate the heat carried by long wavelength acoustic phonons. Phonons obey Bose–Einstein statistics and their chemical potential is 0.

(22.7) Quantum of thermal conductance In Chapter 17 we saw the quantum limit of the maximum electrical conduction across a single electronic channel is the electrical conductance quantum G1d = q2 /h, where q is the unit charge and h is the Planck constant, which is confirmed experimentally. There is also a similar quantum limits on the thermal conductance. In this problem, you calculate the thermal conductance quantum, and show that the Wiedemann–Franz law of the classical Drude model also holds in the quantum limit. For a 1D conductor connected to a source contact on the left, and a drain contact on the right which is at a voltage V, the right flowing current R = q gs gv is J1d L ∑k ≥0 v g ( k ) f s ( k ) where all symbols have their usual meanings. Now suppose that in addition to the voltage difference, there is also a temperature gradient across the conductor i.e. the source is at a slightly higher temperature (T0 + ∆T) than the drain (T0 ). The temperature gradient gives rise to a heat current. The heat carried by the electron in state k is given by ( E(k) − µ) where µ is the chemical potential of that electronic state. Analogous to the charge current, the right flowing heat

(e) Calculate the net heat current Q1d,ph carried by phonons as you did for electrons, and calculate the thermal conductance GQ in 1D for small temperature difference (∆T T0 ), and find GQ,ph = Q1d,ph /∆T. You should again obtain the thermal conductance quantum GQ,ph = π 2 k2b T0 /3h. (f) This behavior was experimentally confirmed in 2000. Read the paper [K. Schwab, E. Henriksen, J. Worlock, and M. L. Roukes, Measurement of the quantum of thermal conductance, Nature 404, 974 (2000)], and comment on what is measured and what you have calculated. Note: From the above it is found that rather sur-

554 Exercises prisingly, GQ = GQ,ph . Fermions and bosons follow different statistics, yet electrons and phonons end up giving the same thermal conductance quantum in 1D. [This problem was created by Jashan Singhal of Cornell University. See Physical Review Research, volume 2, page 043413 (2020).] (22.8) Simplified explanation of phonon scattering In this problem, a heuristic derivation is given why the linear increase in electrical resistivity in metallic conduction in pure crystals is limited by phonon scattering. (a) Because of thermal vibrations, an atom of mass M can be thought of as connected to springs. Because of the thermal energy k b T, its location is blurred out over an oscillation amplitude u0 . Using 12 Mω 2 u20 ≈ k b T, the atom is blurred over a circle of area σ = πu20 . Estimate this area for say a silicon atom at room temperature. (b) Show then that the mean free path of an electron before it collides with the thermally smeared out ”atom-cloud” is lm f p ≈ σ1 ∝ T1 . (c) Since the electrical conductivity is σ ∝ lm f p , argue why σ ∼ T1 and ρ ∼ T.
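For Exercise 22.7(c) and (d), the numerical values follow directly; a minimal Python sketch:

```python
import math

kB, h, q = 1.3807e-23, 6.6261e-34, 1.602e-19  # J/K, J*s, C

def G_Q(T):
    """Thermal conductance quantum pi^2 * kB^2 * T / (3h), in W/K."""
    return math.pi**2 * kB**2 * T / (3.0 * h)

print(f"G_Q(4 K)   ≈ {G_Q(4.0):.2e} W/K")
print(f"G_Q(300 K) ≈ {G_Q(300.0):.2e} W/K")

# Dividing by G_1d*T = (q^2/h)*T recovers the Lorenz number pi^2 kB^2/(3 q^2):
L = G_Q(1.0) / (q**2 / h)
print(f"Lorenz number ≈ {L:.2e} W*Ohm/K^2")
```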

(22.9) Strong electron-phonon coupling

In this chapter, we identified the scattering potentials in Section 22.8. In the next chapter (23), the scattering rates for 3D and 2D electrons by acoustic and optical phonons will be discussed in detail. This problem discusses the case of strong coupling of electrons and phonons that leads to superconductivity.
(a) Explore how strong coupling of electrons with phonons can make the interaction between two electrons switch from the standard Coulomb repulsion to, rather astonishingly, an attractive interaction!
(b) The Bardeen–Cooper–Schrieffer (BCS) theory of superconductivity shows that the formation of a large number of composite electron-phonon-electron objects, called Cooper pairs, is energetically more favorable at low temperature than the electrons remaining independent. This is somewhat like the condensation of water from vapor into a liquid. Discuss why the electrical resistivity of such a superfluid vanishes. Discuss why this is somewhat counterintuitive: the same phonons that increase the resistivity in normal metallic transport end up helping to lead to the complete loss of resistivity.

Scattering, Mobility, and Velocity Saturation

The mobility of electrons and holes plays a defining role in determining the suitability of a semiconductor for practical applications. Since the electrical conductivity σ = qnμ is the product of the mobile carrier density n and the mobility μ, for most applications it is desired to have the highest possible mobility, to achieve the highest electrical conductivity with the least number of mobile electrons (or holes). In this chapter,

• We survey the experimental landscape of electron and hole mobilities in many scenarios, ranging from zero/narrow bandgap semiconductors to ultra-wide bandgap semiconductors,
• We investigate quantitatively the scattering mechanisms that limit the electron mobility, ranging from intrinsic processes such as phonons to extrinsic processes such as dopants, alloy disorder, defects, and interfaces, and
• We investigate the drift velocity as a function of the external electric field to understand velocity saturation and high-field effects.
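As a concrete instance of σ = qnμ from the chapter opening, the Python snippet below evaluates the conductivity for one example case; the carrier density and mobility are assumed illustrative values:

```python
# Conductivity sigma = q*n*mu; the doping and mobility below are example
# values (assumptions), converted from the customary cm-based units to SI.
q = 1.602e-19          # C
n = 1e17 * 1e6         # 1e17 cm^-3 -> m^-3 (assumed doping)
mu = 1000e-4           # 1000 cm^2/(V*s) -> m^2/(V*s) (assumed mobility)
sigma = q * n * mu     # S/m
rho_ohm_cm = 1.0 / sigma * 100.0
print(f"sigma ≈ {sigma:.0f} S/m, resistivity ≈ {rho_ohm_cm:.3f} Ohm*cm")
```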

23.1 Electron mobility: a résumé
23.2 Scattering mechanisms
23.3 Point defect scattering
23.4 Coulomb impurity scattering
23.5 Dipole scattering
23.6 Dislocation scattering
23.7 Alloy disorder scattering
23.8 Interface scattering
23.9 Phonon scattering
23.10 Experimental mobilities
23.11 High-field velocity saturation
23.12 Chapter summary section
Further reading




23.1 Electron mobility: a résumé

Fig. 23.1 (a) shows a typical trend of electron (or hole) mobilities measured in a semiconductor as a function of temperature, and (b) the mobility as a function of the doping density in the semiconductor. Though the exact numbers may vary, the general trends seen in this figure (with minor variations) are seen in most bulk doped semiconductors, and also in semiconductor heterostructures that house lower-dimensional conduction channels. Thus, the purpose of this chapter is to first qualitatively understand the reason behind these trends, and then quantitatively calculate the mobilities from the Boltzmann transport equation using Fermi's golden rule for the scattering rates. The mobility at the lowest (cryogenic) temperatures is limited by defect and impurity scattering because both acoustic and optical phonons are frozen out, as discussed in Chapter 22. Lower impurity and defect densities lead to higher mobilities: for example, in ultraclean GaAs 2DEGs, mobilities as high as ∼10⁷ cm²/V·s have been achieved, resulting in mean free paths of several mm! At room temperature, the mobilities limited by phonon scattering fall in the ∼1000–40,000

Fig. 23.1 Semiconductor mobility trends.

556 Scattering, Mobility, and Velocity Saturation

Fig. 23.2 Conductivity vs. bandgaps of various n-type and p-type doped semiconductors and comparison to metals. Kindly shared by Dr. Kevin Lee of Cornell University.


Fig. 23.3 2DEG mobilities vs. carrier densities at room temperature for various semiconductors.

cm²/V·s range for many useful semiconductors. In most lightly doped compound semiconductors, polar-optical phonon scattering is the dominant mechanism limiting the mobility at room temperature. In elemental non-polar semiconductors such as silicon and germanium this mode is absent, and acoustic phonons dominate. In 2D electron (or hole) gases confined at heterointerfaces such as the silicon MOSFET, the mobility is limited by scattering from the rough interface between the crystalline and amorphous material, limiting it to ∼200 cm²/V·s. But for similar 2DEGs in III-V epitaxial quantum wells, a crystalline heterojunction between nearly lattice-matched semiconductors allows the mobility to reach the intrinsic optical phonon limits, ranging from ∼2000 cm²/V·s in wide bandgap Al(Ga)N/GaN heterostructures to ∼40,000 cm²/V·s in narrow bandgap InSb quantum wells at 300 K. Fig. 23.1 (b) shows that at room temperature, the mobility in a very lightly doped semiconductor approaches the phonon limit. But as the doping ND is increased beyond a certain density, Coulomb scattering from the ionized donors takes over. The ionized impurity scattering limited mobility decreases as ∼1/ND within a window, before leveling off at the highest doping densities. In semiconductor devices, the heavily doped regions typically form ohmic contacts and access regions where a high conductivity is desired. The much higher carrier densities in such heavily doped regions offset the lower mobilities, and deliver the desired higher net conductivity. Fig. 23.2 shows the n-type and p-type resistivities reported for several semiconductors vs. their energy bandgaps; the resistivities of some metals are also shown for comparison. Typical n-type resistivities are lower than p-type resistivities. The lowest semiconductor resistivities are about 200× higher than the lowest metallic resistivities.
The general trend of increasing resistivity with bandgap is only partially due to low mobilities: the more important challenge is to obtain enough mobile carriers, because the donors or acceptors become deep. Fig. 23.3 shows the room-temperature electron mobilities in 2D electron gases of some technologically important semiconductors that are used for field-effect transistors, as a function of the 2D sheet density. The highest electron mobilities of ∼10,000–40,000 cm²/V·s are achieved in quantum well heterostructures of narrow-bandgap semiconductors such as InGaAs/InAlAs and InSb/AlInSb. These are used in transistors which can also be turned on and off due to the presence of a bandgap, albeit narrow. In even narrower bandgap HgCdTe or zero-gap graphene, even higher mobilities can be achieved. But because of the negligible or zero bandgaps, it is not possible to modulate their conductivities enough for transistor action in the field-effect geometry. On the other hand, typical 2DEG mobilities in silicon MOSFETs are in the ∼200 cm²/V·s range, and those in AlGaN/GaN 2DEGs in the ∼1500 cm²/V·s range. In these semiconductors, the density of 2DEGs can typically be higher than in the narrow-bandgap semiconductors because they can sustain higher electric fields without breaking down (see Fig. 23.4).


Fig. 23.4 Room temperature mobilities of 2D electron gases and 2D hole gases vs. the respective sheet carrier densities in various semiconductor families. These transport properties are of high importance in field-effect transistors (FETs) for several applications. Contours of constant sheet resistance are shown in the background, indicating the highest conductivities to the top right. Kindly shared by Dr. Reet Chaudhuri, Cornell University.

Fig. 23.5 shows typical electron and hole mobilities for various semiconductors as a function of the thickness of the semiconductor channel layer. In this case, it is seen that when the thickness approaches nanoscale dimensions, the mobility degrades because of roughness scattering. The class of 2D layered materials does not suffer from this roughness, and can achieve higher mobilities than conventional semiconductors of comparable layer thickness in the thin, nanoscale limit. The range of transport phenomena exhibited by semiconductors is far richer than this short résumé. In this chapter we will be able to sample a large portion, but not all, of them. Amorphous and organic semiconductors do not receive the same in-depth treatment as the inorganic semiconductors do in this book. Some semiconductors lose their resistivities completely, and undergo an electronic phase transition into superconductivity: this phenomenon is not discussed either. In the next section, we describe the transport theory formalism that will be used to quantitatively calculate the experimental mobilities for various semiconductors. The mobilities will be found to depend on their intrinsic physical properties, such as the bandgaps, effective masses, dielectric constants, deformation potentials, etc. The mobility also depends on the physical conditions, such as the temperature and the defect and impurity densities. Based on these considerations, we classify the scattering mechanisms, before proceeding to evaluate the individual scattering mechanisms, and combining them at the end to explain the experimental data described in this short résumé.
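When individually limited mobilities are combined, the standard shortcut is Matthiessen's rule, in which the scattering rates add; it is exact only when all mechanisms share the same energy dependence, and is otherwise an approximation. A minimal sketch with assumed component mobilities:

```python
# Matthiessen's rule: 1/mu_total = sum_i 1/mu_i (rates add).
# The component mobilities below are illustrative assumptions.

def matthiessen(mobilities):
    """Combine individually-limited mobilities into a total mobility."""
    return 1.0 / sum(1.0 / mu for mu in mobilities)

mu_phonon, mu_impurity = 8000.0, 3000.0  # cm^2/V.s, assumed values
mu_total = matthiessen([mu_phonon, mu_impurity])
print(f"mu_total ~ {mu_total:.0f} cm^2/V.s")
```

The total always lies below the weakest individual mechanism, which is why the measured mobility in Fig. 23.1 tracks whichever scattering process is strongest at a given temperature and doping.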

23.2 Scattering mechanisms

The flowchart in Fig. 23.7 shows the method by which the mobility in semiconductors is evaluated using the relaxation-time approximation

Fig. 23.5 Mobilities vs. channel thickness for various semiconductors. Filled red symbols are silicon electron mobility and empty symbols are silicon hole mobility. SOI stands for silicon-on-insulator, CNT stands for carbon nanotube, and on h-BN means graphene on hexagonal boron nitride. For CNTs, tch roughly corresponds to the CNT diameter, as opposed to the layer thickness elsewhere. Modified figure from C. English, E. Pop et al., Nano Lett. 16, 3824 (2016). Kindly shared by Dr. Eric Pop, Stanford University.


of the Boltzmann transport equation. Most steps that go into this evaluation have been discussed in earlier chapters. Here we summarize the steps succinctly and proceed directly to evaluate the individual scattering rates and the resulting mobility. Some concepts that have not been discussed before, such as the screening of scattering potentials, are covered and illustrated in the evaluation of the mobility, while others are recalled and refined for the transport problem.



Fig. 23.6 The dependence of the mobile electron density in d dimensions on both the Fermi level EF and on temperature via η = (EF − Ec)/kb T. A similar relation holds for mobile holes.

0. Carrier statistics: Because the conductivity σ = qnµ defines the mobility µ = σ/qn, under no circumstance must one naively think that the mobility is independent of the carrier concentration n. The mobility depends on the carrier density n. Neglecting this basic fact sometimes leads to confusion and erroneous conclusions. For example, the mobility of electrons in nearly intrinsic GaAs at 300 K may reach 8000 cm2 /V·s when n is extremely low. If n is increased to 1020 /cm3 by donor doping, the mobility plummets to below 1000 cm2 /V·s due to ionized impurity scattering. On the other hand, for similar volume carrier concentrations realized in 2DEG sheets at AlGaAs/GaAs heterojunctions by field effect or by modulation doping, close to the intrinsic mobility of 8000 cm2 /V·s is achieved by the removal of ionized impurity scattering. Mobility is not a fundamental material parameter, and should be understood to be related to the carrier density in every context. Furthermore, it is important to bear in mind that in semiconductor devices, the mobility typically is fixed once the semiconductor structure has been created, and the knob that makes the device click is not so much the mobility, but the carrier density n, which is tuned by several orders of magnitude either by field-effect in FETs, or by carrier injection in pn diodes, bipolar transistors, LEDs, or lasers. Because one needs mobile carriers in the first place to speak about their mobility, it is worthwhile recalling some concepts of carrier densities in bulk semiconductors, and quantized heterostructures such as quantum wells. The density of molecules in air at room temperature and pressure is roughly 1019 /cm3 . Since the density of electrons in semiconductors reaches similar values at the higher side, it is partly the reason it is referred to as an ”electron gas”. 
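The doping dependence of the mobility sketched above is often captured by the empirical Caughey–Thomas fit, µ(ND) = µmin + (µmax − µmin)/(1 + (ND/Nref)^α), originally developed for silicon. The parameters below are assumptions chosen only to echo the GaAs-like numbers quoted in the text, not a published fit:

```python
# Empirical Caughey-Thomas mobility-vs-doping fit.  All four default
# parameters are illustrative assumptions, not fitted material values.

def caughey_thomas(N_D, mu_min=800.0, mu_max=8000.0, N_ref=1e17, alpha=0.75):
    """Mobility (cm^2/V.s) vs ionized donor density N_D (cm^-3)."""
    return mu_min + (mu_max - mu_min) / (1.0 + (N_D / N_ref) ** alpha)

for N_D in (1e15, 1e17, 1e20):
    print(f"N_D = {N_D:.0e} cm^-3 -> mu ~ {caughey_thomas(N_D):.0f} cm^2/V.s")
```

The fit reproduces the qualitative shape of Fig. 23.1 (b): a phonon-limited plateau at low doping, a ∼1/ND fall in the intermediate window, and a leveling off at the heaviest doping.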
The concentration of mobile electrons in the conduction band depends on the energy separation of the Fermi level and the conduction band edge, η = (EF − Ec)/kb T, which is made non-dimensional by dividing by the thermal energy kb T. The carrier density in a parabolic band semiconductor where the electrons are free to move in d dimensions is given by the following forms from earlier chapters:


nd = (gs gv/L^d) Σ_k fk0 = gs gv ∫ d^d k/(2π)^d fk0 = ∫ from Ec to ∞ dEk gd(Ek) fk0 = gs gv (2π m⋆ kb T/h²)^{d/2} F_{d/2−1}((EF − Ec)/kb T) = Ncd F_{d/2−1}(η),   (23.1)

where Ek = ħ²k²/2m⋆ is the parabolic dispersion, fk0 is the equilibrium Fermi–Dirac occupation, and Ncd = gs gv (2π m⋆ kb T/h²)^{d/2} is the d-dimensional band-edge effective density of states.
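Equation 23.1 is straightforward to evaluate numerically for d = 3. The sketch below assumes a GaAs-like m⋆ = 0.067 m0 purely for illustration, computes Nc3d at 300 K, and checks the non-degenerate limit F1/2(η) → e^η:

```python
import numpy as np
from math import gamma

# Numerical sketch of eq. 23.1 for d = 3 (gs = 2, gv = 1); the effective
# mass m* = 0.067 m0 is an illustrative GaAs-like assumption.
h = 6.62607015e-34     # Planck constant, J s
kb = 1.380649e-23      # Boltzmann constant, J/K
m0 = 9.1093837015e-31  # free-electron mass, kg

def F_half(eta):
    """Gamma-normalized Fermi-Dirac integral F_{1/2}(eta), trapezoid rule."""
    u = np.linspace(0.0, 60.0 + max(eta, 0.0), 200001)
    y = np.sqrt(u) / (1.0 + np.exp(u - eta))
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(u)) / gamma(1.5)

def n3d(eta, m_eff=0.067 * m0, T=300.0):
    """3D density n3d = Nc3d * F_{1/2}(eta), returned in cm^-3."""
    Nc3d = 2.0 * (2.0 * np.pi * m_eff * kb * T / h**2) ** 1.5  # in m^-3
    return Nc3d * F_half(eta) * 1e-6

print(f"Nc3d(300 K) ~ {n3d(0.0) / F_half(0.0):.2e} cm^-3")
print(f"F_half(-3) = {F_half(-3.0):.4f} vs Boltzmann e^-3 = {np.exp(-3.0):.4f}")
```

The Boltzmann value slightly overestimates the exact integral even at η = −3, a reminder that the non-degenerate approximation is an asymptotic limit.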



Fig. 23.7 Flow chart for the evaluation of mobility in semiconductors. The last column indicates the information that is necessary to perform the calculations and analysis.

where gd(Ek) is the d-dimensional density of states given in Equation 5.93 and fk0 is the Fermi–Dirac distribution, which in the non-dimensional form reads fk0(u) = 1/(exp[u − η] + 1), where u = Ek/kb T. In the RTA solution to the BTE (Equation 21.61), the conductivity always involves the factor −∂fk0/∂u, which integrates to zero for the carrier density, but is responsible for the current by breaking the symmetry in group velocities in k-space. Fig. 23.6 shows the dependence of the density nd on η. Note that for non-degenerate concentrations the Fermi level is in the energy gap, and nd ≈ Ncd e^η = Ncd e^−(Ec − EF)/kb T: the carrier density depends exponentially on the temperature. For degenerate concentrations, nd ≈ Ncd · η^{d/2}/Γ((d/2) + 1), in which the temperature dependence cancels, indicating metallic behavior. These two features have a strong effect on the transport properties of carriers. For the 3D and 2D cases with gs = 2, gv = 1, the densities are¹ n3d = Nc3d F1/2(η) and n2d = Nc2d ln(1 + e^η). For the 2D case, η = (EF − E0)/kb T, where E0 is the lowest subband energy.

1. Scattering potential Wi(r) and screening: The solution of the Schrödinger equation for the periodic crystal potential V0(r) results in

¹ If η is needed as a function of the carrier density and temperature, for d = 3 this is achieved either by numerically inverting the expression n3d = Nc3d F1/2(η), or with the analytical Joyce–Dixon approximation: η ≈ ln(n3d/Nc3d) + Σ from m=1 to 4 of Am (n3d/Nc3d)^m, where the constants are A1 = 3.536 × 10⁻¹, A2 = −4.950 × 10⁻³, A3 = 1.484 × 10⁻⁴, and A4 = −4.426 × 10⁻⁶.
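The Joyce–Dixon expression in the footnote is easy to code. The sketch below uses exactly the Am constants quoted, and the low-density sanity check should recover the Boltzmann result η ≈ ln(n3d/Nc3d):

```python
import numpy as np

# Joyce-Dixon approximation for eta = (EF - Ec)/kbT from r = n3d/Nc3d,
# using the A_m constants quoted in the footnote (d = 3 case).
A = (3.536e-1, -4.950e-3, 1.484e-4, -4.426e-6)

def eta_joyce_dixon(r):
    """Valid for moderate degeneracy (roughly r = n3d/Nc3d up to ~8)."""
    return np.log(r) + sum(a * r ** (m + 1) for m, a in enumerate(A))

print(f"eta(r=0.01) = {eta_joyce_dixon(0.01):.4f} (Boltzmann: {np.log(0.01):.4f})")
print(f"eta(r=1)    = {eta_joyce_dixon(1.0):.4f} (EF close to the band edge)")
```

At r = 0.01 the correction terms are negligible and the Boltzmann logarithm dominates; by r = 1 the Fermi level has already moved to within a fraction of kb T of the band edge.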


Fig. 23.8 Classification of scattering mechanisms with potentials Wi (r) for elastic processes, and Wper (r)e±iωt for inelastic processes.

² Therefore, if the density of defects is close to the atomic density, this approximation fails. Then one must go back and re-solve the Schrödinger equation with the scattering potential included directly as part of the crystal potential and not as a perturbation. This may or may not be solvable.

Bloch eigenfunctions and electron energy eigenvalues, which form the electron bandstructure Ek. As discussed in Chapter 13, linear combinations of the Bloch eigenfunctions form the effective mass wavefunctions ψ(r) = u(r)C(r), where u(r) is the cell-periodic part and C(r) is the envelope function. Because the potential experienced by electrons is quasi-periodic in doped semiconductors and in the quantized heterostructures used in practical devices, the effective mass form, and the wavepacket picture associated with it, is the correct one for the analysis of transport properties. As discussed in Chapters 14 and 21, the presence of ionized dopants and other defects violates the perfectly periodic potential assumption. But if their density is much lower than the atomic density, it is prudent to assume that the crystal potential is still periodic, with only occasional perturbations of the type Wi(r) separated by many lattice constants². This key assumption permits the use of Fermi's golden rule developed in Chapter 20 to find the scattering rates between effective mass electronic states due to the perturbing potential. A heterostructure leads to a rearrangement of the bandstructure itself, modifying the electronic states from the bulk values to effective mass wavefunctions for the particular heterostructure potential. For example, in heterostructure quantum wells there may be a confining potential and bound states in the z-direction, but plane waves in the x–y plane. In such heterostructures, the perturbations Wi(r) can be due to interface roughness or dopant potentials, but not the intrinsic heterostructure potential itself, which is periodic and already accounted for in the effective mass solution. Potentials that scatter electrons are of two types: 1) static aperiodic potentials of the type Wi(r) that cause a deviation of the periodic crystal potential that does not change in time, and 2) time-varying periodic potentials of the type Wper(r)e^±iωt.
As discussed in Chapter 20, the corresponding Fermi's golden rule takes different forms for the two types of perturbation potentials. Static scattering potentials of type 1 result in elastic scattering of electrons (and holes), in which the energy of the electron before and after scattering is the same, Ek = Ek′, and only the quasimomentum is changed from k → k′, resulting in a scattering rate Si(k → k′) = (2π/ħ)|⟨k′|Wi(r)|k⟩|² δ(Ek − Ek′). Time-varying scatterers of type 2 cause inelastic scattering, changing both the quasimomentum k → k′ and the energy³ by Ek′ = Ek ± ħω wh