Fixed MTP to work with TWRP

This commit is contained in:
awab228 2018-06-19 23:16:04 +02:00
commit f6dfaef42e
50820 changed files with 20846062 additions and 0 deletions

View file

@ -0,0 +1,32 @@
CPU cooling APIs How To
===================================
Written by Amit Daniel Kachhap <amit.kachhap@linaro.org>
Updated: 12 May 2012
Copyright (c) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com)
0. Introduction
The generic cpu cooling(freq clipping) provides registration/unregistration APIs
to the caller. The binding of the cooling devices to the trip point is left for
the user. The registration APIs returns the cooling device pointer.
1. cpu cooling APIs
1.1 cpufreq registration/unregistration APIs
1.1.1 struct thermal_cooling_device *cpufreq_cooling_register(
struct cpumask *clip_cpus)
This interface function registers the cpufreq cooling device with the name
"thermal-cpufreq-%x". This api can support multiple instances of cpufreq
cooling devices.
clip_cpus: cpumask of cpus where the frequency constraints will happen.
1.1.2 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
This interface function unregisters the "thermal-cpufreq-%x" cooling device.
cdev: Cooling device pointer which has to be unregistered.

View file

@ -0,0 +1,77 @@
Kernel driver exynos_tmu
=================
Supported chips:
* ARM SAMSUNG EXYNOS4, EXYNOS5 series of SoC
Datasheet: Not publicly available
Authors: Donggeun Kim <dg77.kim@samsung.com>
Authors: Amit Daniel <amit.daniel@samsung.com>
TMU controller Description:
---------------------------
This driver allows to read temperature inside SAMSUNG EXYNOS4/5 series of SoC.
The chip only exposes the measured 8-bit temperature code value
through a register.
Temperature can be taken from the temperature code.
There are three equations converting from temperature to temperature code.
The three equations are:
1. Two point trimming
Tc = (T - 25) * (TI2 - TI1) / (85 - 25) + TI1
2. One point trimming
Tc = T + TI1 - 25
3. No trimming
Tc = T + 50
Tc: Temperature code, T: Temperature,
TI1: Trimming info for 25 degree Celsius (stored at TRIMINFO register)
Temperature code measured at 25 degree Celsius which is unchanged
TI2: Trimming info for 85 degree Celsius (stored at TRIMINFO register)
Temperature code measured at 85 degree Celsius which is unchanged
TMU(Thermal Management Unit) in EXYNOS4/5 generates interrupt
when temperature exceeds pre-defined levels.
The maximum number of configurable threshold is five.
The threshold levels are defined as follows:
Level_0: current temperature > trigger_level_0 + threshold
Level_1: current temperature > trigger_level_1 + threshold
Level_2: current temperature > trigger_level_2 + threshold
Level_3: current temperature > trigger_level_3 + threshold
The threshold and each trigger_level are set
through the corresponding registers.
When an interrupt occurs, this driver notify kernel thermal framework
with the function exynos_report_trigger.
Although an interrupt condition for level_0 can be set,
it can be used to synchronize the cooling action.
TMU driver description:
-----------------------
The exynos thermal driver is structured as,
Kernel Core thermal framework
(thermal_core.c, step_wise.c, cpu_cooling.c)
^
|
|
TMU configuration data -------> TMU Driver <------> Exynos Core thermal wrapper
(exynos_tmu_data.c) (exynos_tmu.c) (exynos_thermal_common.c)
(exynos_tmu_data.h) (exynos_tmu.h) (exynos_thermal_common.h)
a) TMU configuration data: This consist of TMU register offsets/bitfields
described through structure exynos_tmu_registers. Also several
other platform data (struct exynos_tmu_platform_data) members
are used to configure the TMU.
b) TMU driver: This component initialises the TMU controller and sets different
thresholds. It invokes core thermal implementation with the call
exynos_report_trigger.
c) Exynos Core thermal wrapper: This provides 3 wrapper function to use the
Kernel core thermal framework. They are exynos_unregister_thermal,
exynos_register_thermal and exynos_report_trigger.

View file

@ -0,0 +1,53 @@
EXYNOS EMULATION MODE
========================
Copyright (C) 2012 Samsung Electronics
Written by Jonghwa Lee <jonghwa3.lee@samsung.com>
Description
-----------
Exynos 4x12 (4212, 4412) and 5 series provide emulation mode for thermal management unit.
Thermal emulation mode supports software debug for TMU's operation. User can set temperature
manually with software code and TMU will read current temperature from user value not from
sensor's value.
Enabling CONFIG_THERMAL_EMULATION option will make this support available.
When it's enabled, sysfs node will be created as
/sys/devices/virtual/thermal/thermal_zone'zone id'/emul_temp.
The sysfs node, 'emul_node', will contain value 0 for the initial state. When you input any
temperature you want to update to sysfs node, it automatically enable emulation mode and
current temperature will be changed into it.
(Exynos also supports user changeable delay time which would be used to delay of
changing temperature. However, this node only uses same delay of real sensing time, 938us.)
Exynos emulation mode requires synchronous of value changing and enabling. It means when you
want to update the any value of delay or next temperature, then you have to enable emulation
mode at the same time. (Or you have to keep the mode enabling.) If you don't, it fails to
change the value to updated one and just use last succeessful value repeatedly. That's why
this node gives users the right to change termerpature only. Just one interface makes it more
simply to use.
Disabling emulation mode only requires writing value 0 to sysfs node.
TEMP 120 |
|
100 |
|
80 |
| +-----------
60 | | |
| +-------------| |
40 | | | |
| | | |
20 | | | +----------
| | | | |
0 |______________|_____________|__________|__________|_________
A A A A TIME
|<----->| |<----->| |<----->| |
| 938us | | | | | |
emulation : 0 50 | 70 | 20 | 0
current temp : sensor 50 70 20 sensor

View file

@ -0,0 +1,307 @@
=======================
INTEL POWERCLAMP DRIVER
=======================
By: Arjan van de Ven <arjan@linux.intel.com>
Jacob Pan <jacob.jun.pan@linux.intel.com>
Contents:
(*) Introduction
- Goals and Objectives
(*) Theory of Operation
- Idle Injection
- Calibration
(*) Performance Analysis
- Effectiveness and Limitations
- Power vs Performance
- Scalability
- Calibration
- Comparison with Alternative Techniques
(*) Usage and Interfaces
- Generic Thermal Layer (sysfs)
- Kernel APIs (TBD)
============
INTRODUCTION
============
Consider the situation where a systems power consumption must be
reduced at runtime, due to power budget, thermal constraint, or noise
level, and where active cooling is not preferred. Software managed
passive power reduction must be performed to prevent the hardware
actions that are designed for catastrophic scenarios.
Currently, P-states, T-states (clock modulation), and CPU offlining
are used for CPU throttling.
On Intel CPUs, C-states provide effective power reduction, but so far
theyre only used opportunistically, based on workload. With the
development of intel_powerclamp driver, the method of synchronizing
idle injection across all online CPU threads was introduced. The goal
is to achieve forced and controllable C-state residency.
Test/Analysis has been made in the areas of power, performance,
scalability, and user experience. In many cases, clear advantage is
shown over taking the CPU offline or modulating the CPU clock.
===================
THEORY OF OPERATION
===================
Idle Injection
--------------
On modern Intel processors (Nehalem or later), package level C-state
residency is available in MSRs, thus also available to the kernel.
These MSRs are:
#define MSR_PKG_C2_RESIDENCY 0x60D
#define MSR_PKG_C3_RESIDENCY 0x3F8
#define MSR_PKG_C6_RESIDENCY 0x3F9
#define MSR_PKG_C7_RESIDENCY 0x3FA
If the kernel can also inject idle time to the system, then a
closed-loop control system can be established that manages package
level C-state. The intel_powerclamp driver is conceived as such a
control system, where the target set point is a user-selected idle
ratio (based on power reduction), and the error is the difference
between the actual package level C-state residency ratio and the target idle
ratio.
Injection is controlled by high priority kernel threads, spawned for
each online CPU.
These kernel threads, with SCHED_FIFO class, are created to perform
clamping actions of controlled duty ratio and duration. Each per-CPU
thread synchronizes its idle time and duration, based on the rounding
of jiffies, so accumulated errors can be prevented to avoid a jittery
effect. Threads are also bound to the CPU such that they cannot be
migrated, unless the CPU is taken offline. In this case, threads
belong to the offlined CPUs will be terminated immediately.
Running as SCHED_FIFO and relatively high priority, also allows such
scheme to work for both preemptable and non-preemptable kernels.
Alignment of idle time around jiffies ensures scalability for HZ
values. This effect can be better visualized using a Perf timechart.
The following diagram shows the behavior of kernel thread
kidle_inject/cpu. During idle injection, it runs monitor/mwait idle
for a given "duration", then relinquishes the CPU to other tasks,
until the next time interval.
The NOHZ schedule tick is disabled during idle time, but interrupts
are not masked. Tests show that the extra wakeups from scheduler tick
have a dramatic impact on the effectiveness of the powerclamp driver
on large scale systems (Westmere system with 80 processors).
CPU0
____________ ____________
kidle_inject/0 | sleep | mwait | sleep |
_________| |________| |_______
duration
CPU1
____________ ____________
kidle_inject/1 | sleep | mwait | sleep |
_________| |________| |_______
^
|
|
roundup(jiffies, interval)
Only one CPU is allowed to collect statistics and update global
control parameters. This CPU is referred to as the controlling CPU in
this document. The controlling CPU is elected at runtime, with a
policy that favors BSP, taking into account the possibility of a CPU
hot-plug.
In terms of dynamics of the idle control system, package level idle
time is considered largely as a non-causal system where its behavior
cannot be based on the past or current input. Therefore, the
intel_powerclamp driver attempts to enforce the desired idle time
instantly as given input (target idle ratio). After injection,
powerclamp moniors the actual idle for a given time window and adjust
the next injection accordingly to avoid over/under correction.
When used in a causal control system, such as a temperature control,
it is up to the user of this driver to implement algorithms where
past samples and outputs are included in the feedback. For example, a
PID-based thermal controller can use the powerclamp driver to
maintain a desired target temperature, based on integral and
derivative gains of the past samples.
Calibration
-----------
During scalability testing, it is observed that synchronized actions
among CPUs become challenging as the number of cores grows. This is
also true for the ability of a system to enter package level C-states.
To make sure the intel_powerclamp driver scales well, online
calibration is implemented. The goals for doing such a calibration
are:
a) determine the effective range of idle injection ratio
b) determine the amount of compensation needed at each target ratio
Compensation to each target ratio consists of two parts:
a) steady state error compensation
This is to offset the error occurring when the system can
enter idle without extra wakeups (such as external interrupts).
b) dynamic error compensation
When an excessive amount of wakeups occurs during idle, an
additional idle ratio can be added to quiet interrupts, by
slowing down CPU activities.
A debugfs file is provided for the user to examine compensation
progress and results, such as on a Westmere system.
[jacob@nex01 ~]$ cat
/sys/kernel/debug/intel_powerclamp/powerclamp_calib
controlling cpu: 0
pct confidence steady dynamic (compensation)
0 0 0 0
1 1 0 0
2 1 1 0
3 3 1 0
4 3 1 0
5 3 1 0
6 3 1 0
7 3 1 0
8 3 1 0
...
30 3 2 0
31 3 2 0
32 3 1 0
33 3 2 0
34 3 1 0
35 3 2 0
36 3 1 0
37 3 2 0
38 3 1 0
39 3 2 0
40 3 3 0
41 3 1 0
42 3 2 0
43 3 1 0
44 3 1 0
45 3 2 0
46 3 3 0
47 3 0 0
48 3 2 0
49 3 3 0
Calibration occurs during runtime. No offline method is available.
Steady state compensation is used only when confidence levels of all
adjacent ratios have reached satisfactory level. A confidence level
is accumulated based on clean data collected at runtime. Data
collected during a period without extra interrupts is considered
clean.
To compensate for excessive amounts of wakeup during idle, additional
idle time is injected when such a condition is detected. Currently,
we have a simple algorithm to double the injection ratio. A possible
enhancement might be to throttle the offending IRQ, such as delaying
EOI for level triggered interrupts. But it is a challenge to be
non-intrusive to the scheduler or the IRQ core code.
CPU Online/Offline
------------------
Per-CPU kernel threads are started/stopped upon receiving
notifications of CPU hotplug activities. The intel_powerclamp driver
keeps track of clamping kernel threads, even after they are migrated
to other CPUs, after a CPU offline event.
=====================
Performance Analysis
=====================
This section describes the general performance data collected on
multiple systems, including Westmere (80P) and Ivy Bridge (4P, 8P).
Effectiveness and Limitations
-----------------------------
The maximum range that idle injection is allowed is capped at 50
percent. As mentioned earlier, since interrupts are allowed during
forced idle time, excessive interrupts could result in less
effectiveness. The extreme case would be doing a ping -f to generated
flooded network interrupts without much CPU acknowledgement. In this
case, little can be done from the idle injection threads. In most
normal cases, such as scp a large file, applications can be throttled
by the powerclamp driver, since slowing down the CPU also slows down
network protocol processing, which in turn reduces interrupts.
When control parameters change at runtime by the controlling CPU, it
may take an additional period for the rest of the CPUs to catch up
with the changes. During this time, idle injection is out of sync,
thus not able to enter package C- states at the expected ratio. But
this effect is minor, in that in most cases change to the target
ratio is updated much less frequently than the idle injection
frequency.
Scalability
-----------
Tests also show a minor, but measurable, difference between the 4P/8P
Ivy Bridge system and the 80P Westmere server under 50% idle ratio.
More compensation is needed on Westmere for the same amount of
target idle ratio. The compensation also increases as the idle ratio
gets larger. The above reason constitutes the need for the
calibration code.
On the IVB 8P system, compared to an offline CPU, powerclamp can
achieve up to 40% better performance per watt. (measured by a spin
counter summed over per CPU counting threads spawned for all running
CPUs).
====================
Usage and Interfaces
====================
The powerclamp driver is registered to the generic thermal layer as a
cooling device. Currently, its not bound to any thermal zones.
jacob@chromoly:/sys/class/thermal/cooling_device14$ grep . *
cur_state:0
max_state:50
type:intel_powerclamp
Example usage:
- To inject 25% idle time
$ sudo sh -c "echo 25 > /sys/class/thermal/cooling_device80/cur_state
"
If the system is not busy and has more than 25% idle time already,
then the powerclamp driver will not start idle injection. Using Top
will not show idle injection kernel threads.
If the system is busy (spin test below) and has less than 25% natural
idle time, powerclamp kernel threads will do idle injection, which
appear running to the scheduler. But the overall system idle is still
reflected. In this example, 24.1% idle is shown. This helps the
system admin or user determine the cause of slowdown, when a
powerclamp driver is in action.
Tasks: 197 total, 1 running, 196 sleeping, 0 stopped, 0 zombie
Cpu(s): 71.2%us, 4.7%sy, 0.0%ni, 24.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3943228k total, 1689632k used, 2253596k free, 74960k buffers
Swap: 4087804k total, 0k used, 4087804k free, 945336k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3352 jacob 20 0 262m 644 428 S 286 0.0 0:17.16 spin
3341 root -51 0 0 0 0 D 25 0.0 0:01.62 kidle_inject/0
3344 root -51 0 0 0 0 D 25 0.0 0:01.60 kidle_inject/3
3342 root -51 0 0 0 0 D 25 0.0 0:01.61 kidle_inject/1
3343 root -51 0 0 0 0 D 25 0.0 0:01.60 kidle_inject/2
2935 jacob 20 0 696m 125m 35m S 5 3.3 0:31.11 firefox
1546 root 20 0 158m 20m 6640 S 3 0.5 0:26.97 Xorg
2100 jacob 20 0 1223m 88m 30m S 3 2.3 0:23.68 compiz
Tests have shown that by using the powerclamp driver as a cooling
device, a PID based userspace thermal controller can manage to
control CPU temperature effectively, when no other thermal influence
is added. For example, a UltraBook user can compile the kernel under
certain temperature (below most active trip points).

View file

@ -0,0 +1,82 @@
Kernel driver nouveau
===================
Supported chips:
* NV43+
Authors: Martin Peres (mupuf) <martin.peres@free.fr>
Description
---------
This driver allows to read the GPU core temperature, drive the GPU fan and
set temperature alarms.
Currently, due to the absence of in-kernel API to access HWMON drivers, Nouveau
cannot access any of the i2c external monitoring chips it may find. If you
have one of those, temperature and/or fan management through Nouveau's HWMON
interface is likely not to work. This document may then not cover your situation
entirely.
Temperature management
--------------------
Temperature is exposed under as a read-only HWMON attribute temp1_input.
In order to protect the GPU from overheating, Nouveau supports 4 configurable
temperature thresholds:
* Fan_boost: Fan speed is set to 100% when reaching this temperature;
* Downclock: The GPU will be downclocked to reduce its power dissipation;
* Critical: The GPU is put on hold to further lower power dissipation;
* Shutdown: Shut the computer down to protect your GPU.
WARNING: Some of these thresholds may not be used by Nouveau depending
on your chipset.
The default value for these thresholds comes from the GPU's vbios. These
thresholds can be configured thanks to the following HWMON attributes:
* Fan_boost: temp1_auto_point1_temp and temp1_auto_point1_temp_hyst;
* Downclock: temp1_max and temp1_max_hyst;
* Critical: temp1_crit and temp1_crit_hyst;
* Shutdown: temp1_emergency and temp1_emergency_hyst.
NOTE: Remember that the values are stored as milli degrees Celcius. Don't forget
to multiply!
Fan management
------------
Not all cards have a drivable fan. If you do, then the following HWMON
attributes should be available:
* pwm1_enable: Current fan management mode (NONE, MANUAL or AUTO);
* pwm1: Current PWM value (power percentage);
* pwm1_min: The minimum PWM speed allowed;
* pwm1_max: The maximum PWM speed allowed (bypassed when hitting Fan_boost);
You may also have the following attribute:
* fan1_input: Speed in RPM of your fan.
Your fan can be driven in different modes:
* 0: The fan is left untouched;
* 1: The fan can be driven in manual (use pwm1 to change the speed);
* 2; The fan is driven automatically depending on the temperature.
NOTE: Be sure to use the manual mode if you want to drive the fan speed manually
NOTE2: When operating in manual mode outside the vbios-defined
[PWM_min, PWM_max] range, the reported fan speed (RPM) may not be accurate
depending on your hardware.
Bug reports
---------
Thermal management on Nouveau is new and may not work on all cards. If you have
inquiries, please ping mupuf on IRC (#nouveau, freenode).
Bug reports should be filled on Freedesktop's bug tracker. Please follow
http://nouveau.freedesktop.org/wiki/Bugs

View file

@ -0,0 +1,401 @@
Generic Thermal Sysfs driver How To
===================================
Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com>
Updated: 2 January 2008
Copyright (c) 2008 Intel Corporation
0. Introduction
The generic thermal sysfs provides a set of interfaces for thermal zone
devices (sensors) and thermal cooling devices (fan, processor...) to register
with the thermal management solution and to be a part of it.
This how-to focuses on enabling new thermal zone and cooling devices to
participate in thermal management.
This solution is platform independent and any type of thermal zone devices
and cooling devices should be able to make use of the infrastructure.
The main task of the thermal sysfs driver is to expose thermal zone attributes
as well as cooling device attributes to the user space.
An intelligent thermal management application can make decisions based on
inputs from thermal zone attributes (the current temperature and trip point
temperature) and throttle appropriate devices.
[0-*] denotes any positive number starting from 0
[1-*] denotes any positive number starting from 1
1. thermal sysfs driver interface functions
1.1 thermal zone device interface
1.1.1 struct thermal_zone_device *thermal_zone_device_register(char *type,
int trips, int mask, void *devdata,
struct thermal_zone_device_ops *ops,
const struct thermal_zone_params *tzp,
int passive_delay, int polling_delay))
This interface function adds a new thermal zone device (sensor) to
/sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the
thermal cooling devices registered at the same time.
type: the thermal zone type.
trips: the total number of trip points this thermal zone supports.
mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
devdata: device private data
ops: thermal zone device call-backs.
.bind: bind the thermal zone device with a thermal cooling device.
.unbind: unbind the thermal zone device with a thermal cooling device.
.get_temp: get the current temperature of the thermal zone.
.get_mode: get the current mode (enabled/disabled) of the thermal zone.
- "enabled" means the kernel thermal management is enabled.
- "disabled" will prevent kernel thermal driver action upon trip points
so that user applications can take charge of thermal management.
.set_mode: set the mode (enabled/disabled) of the thermal zone.
.get_trip_type: get the type of certain trip point.
.get_trip_temp: get the temperature above which the certain trip point
will be fired.
.set_emul_temp: set the emulation temperature which helps in debugging
different threshold temperature points.
tzp: thermal zone platform parameters.
passive_delay: number of milliseconds to wait between polls when
performing passive cooling.
polling_delay: number of milliseconds to wait between polls when checking
whether trip points have been crossed (0 for interrupt driven systems).
1.1.2 void thermal_zone_device_unregister(struct thermal_zone_device *tz)
This interface function removes the thermal zone device.
It deletes the corresponding entry form /sys/class/thermal folder and
unbind all the thermal cooling devices it uses.
1.2 thermal cooling device interface
1.2.1 struct thermal_cooling_device *thermal_cooling_device_register(char *name,
void *devdata, struct thermal_cooling_device_ops *)
This interface function adds a new thermal cooling device (fan/processor/...)
to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
to all the thermal zone devices register at the same time.
name: the cooling device name.
devdata: device private data.
ops: thermal cooling devices call-backs.
.get_max_state: get the Maximum throttle state of the cooling device.
.get_cur_state: get the Current throttle state of the cooling device.
.set_cur_state: set the Current throttle state of the cooling device.
1.2.2 void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
This interface function remove the thermal cooling device.
It deletes the corresponding entry form /sys/class/thermal folder and
unbind itself from all the thermal zone devices using it.
1.3 interface for binding a thermal zone device with a thermal cooling device
1.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
int trip, struct thermal_cooling_device *cdev,
unsigned long upper, unsigned long lower);
This interface function bind a thermal cooling device to the certain trip
point of a thermal zone device.
This function is usually called in the thermal zone device .bind callback.
tz: the thermal zone device
cdev: thermal cooling device
trip: indicates which trip point the cooling devices is associated with
in this thermal zone.
upper:the Maximum cooling state for this trip point.
THERMAL_NO_LIMIT means no upper limit,
and the cooling device can be in max_state.
lower:the Minimum cooling state can be used for this trip point.
THERMAL_NO_LIMIT means no lower limit,
and the cooling device can be in cooling state 0.
1.3.2 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
int trip, struct thermal_cooling_device *cdev);
This interface function unbind a thermal cooling device from the certain
trip point of a thermal zone device. This function is usually called in
the thermal zone device .unbind callback.
tz: the thermal zone device
cdev: thermal cooling device
trip: indicates which trip point the cooling devices is associated with
in this thermal zone.
1.4 Thermal Zone Parameters
1.4.1 struct thermal_bind_params
This structure defines the following parameters that are used to bind
a zone with a cooling device for a particular trip point.
.cdev: The cooling device pointer
.weight: The 'influence' of a particular cooling device on this zone.
This is on a percentage scale. The sum of all these weights
(for a particular zone) cannot exceed 100.
.trip_mask:This is a bit mask that gives the binding relation between
this thermal zone and cdev, for a particular trip point.
If nth bit is set, then the cdev and thermal zone are bound
for trip point n.
.limits: This is an array of cooling state limits. Must have exactly
2 * thermal_zone.number_of_trip_points. It is an array consisting
of tuples <lower-state upper-state> of state limits. Each trip
will be associated with one state limit tuple when binding.
A NULL pointer means <THERMAL_NO_LIMITS THERMAL_NO_LIMITS>
on all trips. These limits are used when binding a cdev to a
trip point.
.match: This call back returns success(0) if the 'tz and cdev' need to
be bound, as per platform data.
1.4.2 struct thermal_zone_params
This structure defines the platform level parameters for a thermal zone.
This data, for each thermal zone should come from the platform layer.
This is an optional feature where some platforms can choose not to
provide this data.
.governor_name: Name of the thermal governor used for this zone
.no_hwmon: a boolean to indicate if the thermal to hwmon sysfs interface
is required. when no_hwmon == false, a hwmon sysfs interface
will be created. when no_hwmon == true, nothing will be done.
In case the thermal_zone_params is NULL, the hwmon interface
will be created (for backward compatibility).
.num_tbps: Number of thermal_bind_params entries for this zone
.tbp: thermal_bind_params entries
2. sysfs attributes structure
RO read only value
RW read/write value
Thermal sysfs attributes will be represented under /sys/class/thermal.
Hwmon sysfs I/F extension is also available under /sys/class/hwmon
if hwmon is compiled in or built as a module.
Thermal zone device sys I/F, created once it's registered:
/sys/class/thermal/thermal_zone[0-*]:
|---type: Type of the thermal zone
|---temp: Current temperature
|---mode: Working mode of the thermal zone
|---policy: Thermal governor used for this zone
|---trip_point_[0-*]_temp: Trip point temperature
|---trip_point_[0-*]_type: Trip point type
|---trip_point_[0-*]_hyst: Hysteresis value for this trip point
|---emul_temp: Emulated temperature set node
Thermal cooling device sys I/F, created once it's registered:
/sys/class/thermal/cooling_device[0-*]:
|---type: Type of the cooling device(processor/fan/...)
|---max_state: Maximum cooling state of the cooling device
|---cur_state: Current cooling state of the cooling device
Then next two dynamic attributes are created/removed in pairs. They represent
the relationship between a thermal zone and its associated cooling device.
They are created/removed for each successful execution of
thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device.
/sys/class/thermal/thermal_zone[0-*]:
|---cdev[0-*]: [0-*]th cooling device in current thermal zone
|---cdev[0-*]_trip_point: Trip point that cdev[0-*] is associated with
Besides the thermal zone device sysfs I/F and cooling device sysfs I/F,
the generic thermal driver also creates a hwmon sysfs I/F for each _type_
of thermal zone device. E.g. the generic thermal driver registers one hwmon
class device and build the associated hwmon sysfs I/F for all the registered
ACPI thermal zones.
/sys/class/hwmon/hwmon[0-*]:
|---name: The type of the thermal zone devices
|---temp[1-*]_input: The current temperature of thermal zone [1-*]
|---temp[1-*]_critical: The critical trip point of thermal zone [1-*]
Please read Documentation/hwmon/sysfs-interface for additional information.
***************************
* Thermal zone attributes *
***************************
type
Strings which represent the thermal zone type.
This is given by thermal zone driver as part of registration.
E.g: "acpitz" indicates it's an ACPI thermal device.
In order to keep it consistent with hwmon sys attribute; this should
be a short, lowercase string, not containing spaces nor dashes.
RO, Required
temp
Current temperature as reported by thermal zone (sensor).
Unit: millidegree Celsius
RO, Required
mode
One of the predefined values in [enabled, disabled].
This file gives information about the algorithm that is currently
managing the thermal zone. It can be either default kernel based
algorithm or user space application.
enabled = enable Kernel Thermal management.
disabled = Preventing kernel thermal zone driver actions upon
trip points so that user application can take full
charge of the thermal management.
RW, Optional
policy
One of the various thermal governors used for a particular zone.
RW, Required
trip_point_[0-*]_temp
The temperature above which trip point will be fired.
Unit: millidegree Celsius
RO, Optional
trip_point_[0-*]_type
Strings which indicate the type of the trip point.
E.g. it can be one of critical, hot, passive, active[0-*] for ACPI
thermal zone.
RO, Optional
trip_point_[0-*]_hyst
The hysteresis value for a trip point, represented as an integer
Unit: Celsius
RW, Optional
cdev[0-*]
Sysfs link to the thermal cooling device node where the sys I/F
for cooling device throttling control represents.
RO, Optional
cdev[0-*]_trip_point
The trip point with which cdev[0-*] is associated in this thermal
zone; -1 means the cooling device is not associated with any trip
point.
RO, Optional
passive
Attribute is only present for zones in which the passive cooling
policy is not supported by native thermal driver. Default is zero
and can be set to a temperature (in millidegrees) to enable a
passive trip point for the zone. Activation is done by polling with
an interval of 1 second.
Unit: millidegrees Celsius
Valid values: 0 (disabled) or greater than 1000
RW, Optional
emul_temp
Interface to set the emulated temperature method in thermal zone
(sensor). After setting this temperature, the thermal zone may pass
this temperature to platform emulation function if registered or
cache it locally. This is useful in debugging different temperature
threshold and its associated cooling action. This is write only node
and writing 0 on this node should disable emulation.
Unit: millidegree Celsius
WO, Optional
WARNING: Be careful while enabling this option on production systems,
because userland can easily disable the thermal policy by simply
flooding this sysfs node with low temperature values.
*****************************
* Cooling device attributes *
*****************************
type
String which represents the type of device, e.g:
- for generic ACPI: should be "Fan", "Processor" or "LCD"
- for memory controller device on intel_menlow platform:
should be "Memory controller".
RO, Required
max_state
The maximum permissible cooling state of this cooling device.
RO, Required
cur_state
The current cooling state of this cooling device.
The value can any integer numbers between 0 and max_state:
- cur_state == 0 means no cooling
- cur_state == max_state means the maximum cooling.
RW, Required
3. A simple implementation
ACPI thermal zone may support multiple trip points like critical, hot,
passive, active. If an ACPI thermal zone supports critical, passive,
active[0] and active[1] at the same time, it may register itself as a
thermal_zone_device (thermal_zone1) with 4 trip points in all.
It has one processor and one fan, which are both registered as
thermal_cooling_device.
If the processor is listed in _PSL method, and the fan is listed in _AL0
method, the sys I/F structure will be built like this:
/sys/class/thermal:
|thermal_zone1:
|---type: acpitz
|---temp: 37000
|---mode: enabled
|---policy: step_wise
|---trip_point_0_temp: 100000
|---trip_point_0_type: critical
|---trip_point_1_temp: 80000
|---trip_point_1_type: passive
|---trip_point_2_temp: 70000
|---trip_point_2_type: active0
|---trip_point_3_temp: 60000
|---trip_point_3_type: active1
|---cdev0: --->/sys/class/thermal/cooling_device0
|---cdev0_trip_point: 1 /* cdev0 can be used for passive */
|---cdev1: --->/sys/class/thermal/cooling_device3
|---cdev1_trip_point: 2 /* cdev1 can be used for active[0]*/
|cooling_device0:
|---type: Processor
|---max_state: 8
|---cur_state: 0
|cooling_device3:
|---type: Fan
|---max_state: 2
|---cur_state: 0
/sys/class/hwmon:
|hwmon0:
|---name: acpitz
|---temp1_input: 37000
|---temp1_crit: 100000
4. Event Notification
The framework includes a simple notification mechanism, in the form of a
netlink event. Netlink socket initialization is done during the _init_
of the framework. Drivers which intend to use the notification mechanism
just need to call thermal_generate_netlink_event() with two arguments viz
(originator, event). The originator is a pointer to struct thermal_zone_device
from where the event has been originated. An integer which represents the
thermal zone device will be used in the message to identify the zone. The
event will be one of:{THERMAL_AUX0, THERMAL_AUX1, THERMAL_CRITICAL,
THERMAL_DEV_FAULT}. Notification can be sent when the current temperature
crosses any of the configured thresholds.
5. Export Symbol APIs:
5.1: get_tz_trend:
This function returns the trend of a thermal zone, i.e the rate of change
of temperature of the thermal zone. Ideally, the thermal sensor drivers
are supposed to implement the callback. If they don't, the thermal
framework calculated the trend by comparing the previous and the current
temperature values.
5.2:get_thermal_instance:
This function returns the thermal_instance corresponding to a given
{thermal_zone, cooling_device, trip_point} combination. Returns NULL
if such an instance does not exist.
5.3:thermal_notify_framework:
This function handles the trip events from sensor drivers. It starts
throttling the cooling devices according to the policy configured.
For CRITICAL and HOT trip points, this notifies the respective drivers,
and does actual throttling for other trip points i.e ACTIVE and PASSIVE.
The throttling policy is based on the configured platform data; if no
platform data is provided, this uses the step_wise throttling policy.
5.4:thermal_cdev_update:
This function serves as an arbitrator to set the state of a cooling
device. It sets the cooling device to the deepest cooling state if
possible.

View file

@ -0,0 +1,47 @@
Kernel driver: x86_pkg_temp_thermal
===================
Supported chips:
* x86: with package level thermal management
(Verify using: CPUID.06H:EAX[bit 6] =1)
Authors: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reference
---
Intel® 64 and IA-32 Architectures Software Developers Manual (Jan, 2013):
Chapter 14.6: PACKAGE LEVEL THERMAL MANAGEMENT
Description
---------
This driver register CPU digital temperature package level sensor as a thermal
zone with maximum two user mode configurable trip points. Number of trip points
depends on the capability of the package. Once the trip point is violated,
user mode can receive notification via thermal notification mechanism and can
take any action to control temperature.
Threshold management
--------------------
Each package will register as a thermal zone under /sys/class/thermal.
Example:
/sys/class/thermal/thermal_zone1
This contains two trip points:
- trip_point_0_temp
- trip_point_1_temp
User can set any temperature between 0 to TJ-Max temperature. Temperature units
are in milli-degree Celsius. Refer to "Documentation/thermal/sysfs-api.txt" for
thermal sys-fs details.
Any value other than 0 in these trip points, can trigger thermal notifications.
Setting 0, stops sending thermal notifications.
Thermal notifications: To get kobject-uevent notifications, set the thermal zone
policy to "user_space". For example: echo -n "user_space" > policy