Go Source Code Analysis: Floating-Point Handling

December 2, 2020

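What is the difference between division and modulo?

Division and modulo are two different arithmetic operations.

Division takes one number divided by another and yields the quotient. For example, 10 divided by 3 gives an integer quotient of 3 with a remainder of 1. Most programming languages write division with the / operator.

Modulo (remainder) takes one number divided by another and yields the remainder. For example, 10 divided by 3 leaves a remainder of 1. Most programming languages write it with the % operator.

In programming, the two are most often used together on integer data: division gives the quotient of two numbers, and modulo gives the remainder.

Here is an example in Python: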

a = 10
b = 3

quotient = a / b  # division: yields the quotient
remainder = a % b  # modulo: yields the remainder

print("quotient:", quotient)  # prints: quotient: 3.3333333333333335
print("remainder:", remainder)  # prints: remainder: 1

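In this example, a / b performs division and yields 3.3333333333333335, the floating-point quotient of 10 and 3, while a % b performs modulo and yields 1, the remainder of 10 divided by 3.

Note that the result of division may be a floating-point number (as with the quotient above), and the result of modulo is an integer when both operands are integers. The sign of a modulo result depends on the language: in Python it follows the divisor, while in Go and C it follows the dividend.

Summary:

Division computes the quotient; depending on the language and operand types, the result may be a floating-point number.
Modulo computes the remainder; for integer operands the result is an integer.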

Why doesn't division in Go produce a fractional part?

package main

import (
	"fmt"
)

func main() {
	fmt.Println(1 / 2)  // 0
	fmt.Println(11 / 5) // 2
	fmt.Println(11 % 5) // 1
}

package main

import "fmt"

func main() {
	a := 12
	b := 5

	quotient := float64(a) / float64(b) // convert both operands to float64, then divide
	fmt.Println("quotient:", quotient)  // prints: quotient: 2.4
}

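In Go, dividing one integer by another yields an integer: the fractional part is discarded. The quotient is truncated toward zero, which is not the same as rounding down (floor) when the result is negative; for example, -7 / 2 is -3, not -4.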
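To make the difference between truncation toward zero and rounding down concrete, here is a small sketch. The helper names truncDiv and floorDiv are made up for this illustration; floorDiv goes through float64 and math.Floor, which is fine for small values like these.

```go
package main

import (
	"fmt"
	"math"
)

// truncDiv is Go's built-in integer division, which truncates toward zero.
func truncDiv(a, b int) int { return a / b }

// floorDiv rounds the quotient down (toward negative infinity) instead.
func floorDiv(a, b int) int {
	return int(math.Floor(float64(a) / float64(b)))
}

func main() {
	fmt.Println(truncDiv(7, 2), truncDiv(-7, 2)) // 3 -3
	fmt.Println(floorDiv(7, 2), floorDiv(-7, 2)) // 3 -4
	fmt.Println(7%2, -7%2)                       // 1 -1
}
```

The last line also shows that the remainder takes the sign of the dividend in Go.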

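To get a floating-point result, convert the operands to a floating-point type. Go does not allow an int variable and a float64 variable to be mixed in one expression, so both operands have to be converted.

Here is an example: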

package main

import "fmt"

func main() {
	a := 10
	b := 3

	quotient := float64(a) / float64(b) // convert both operands to float64, then divide
	fmt.Println("quotient:", quotient)  // prints: quotient: 3.3333333333333335
}

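In this example, float64(a) and float64(b) convert the operands to floating-point values before dividing, so the quotient is a floating-point number, which fmt.Println then prints.

Note that if both operands are integers, the result is still an integer. Converting afterwards, as in float64(a / b), does not help, because the fractional part is already gone by then. For variables, both operands must be converted to a floating-point type before dividing; for untyped constants, writing one of them as a floating-point constant (1 / 2.0) is enough.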
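The order of conversion and division is easy to get wrong, so here is a minimal sketch comparing the two. The function names lateConvert and earlyConvert are invented for this example.

```go
package main

import "fmt"

// lateConvert divides as integers first and converts afterwards,
// so the fractional part is already lost.
func lateConvert(a, b int) float64 { return float64(a / b) }

// earlyConvert converts both operands first, then divides as float64.
func earlyConvert(a, b int) float64 { return float64(a) / float64(b) }

func main() {
	fmt.Println(lateConvert(11, 5))  // 2
	fmt.Println(earlyConvert(11, 5)) // 2.2
	fmt.Println(1 / 2.0)             // 0.5 (untyped constants: one float constant is enough)
}
```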

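Go is a strongly typed language, so conversions must be explicit. Think about it: two ints go into an operation and a float comes out? That would not be right. So Go truncates the result and drops the fractional part.

PHP, by contrast, would output a floating-point number. PHP is a dynamically typed language (as are Lua, Python, and Erlang) and converts between numeric types implicitly. When I said earlier that PHP runs into this problem, I meant rounding up, rounding down, and rounding to nearest: for a value that should be 4.7, whether you end up with 4 or 5 depends on ceil, floor, and round.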

https://blog.csdn.net/lz0426001/article/details/42707599

There the output is 0.5.

Starting from a question:

https://floating-point-gui.de/

https://github.com/golang/go/issues/34756

value1 := strconv.FormatFloat(9.815, 'f', 2, 64)
fmt.Println(value1) //9.81

value2 := strconv.FormatFloat(9.185, 'f', 2, 64)
fmt.Println(value2) //9.19

Why does this happen?
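One way to approach the question is to check which side of the written literal the stored float64 actually falls on. The sketch below is an illustration, not part of the strconv source: cmpStored is a made-up helper that parses the literal with 200 bits of precision (an arbitrary choice, much finer than float64) and compares it with the stored value.

```go
package main

import (
	"fmt"
	"math/big"
	"strconv"
)

// cmpStored compares the float64 actually stored for x with the decimal
// literal it was written as: -1 means the stored value is smaller,
// +1 means it is larger, 0 means the literal is exactly representable.
func cmpStored(x float64, literal string) int {
	want, _, err := big.ParseFloat(literal, 10, 200, big.ToNearestEven)
	if err != nil {
		panic(err)
	}
	return new(big.Float).SetFloat64(x).Cmp(want)
}

func main() {
	cases := []struct {
		x   float64
		lit string
	}{{9.815, "9.815"}, {9.185, "9.185"}}
	for _, c := range cases {
		fmt.Println(c.lit,
			"stored as", strconv.FormatFloat(c.x, 'f', 20, 64),
			"cmp:", cmpStored(c.x, c.lit),
			"rounded:", strconv.FormatFloat(c.x, 'f', 2, 64))
	}
}
```

The float64 nearest to 9.815 is slightly below 9.815, so it rounds to 9.81, while the float64 nearest to 9.185 is slightly above 9.185, so it rounds to 9.19. FormatFloat rounds the stored binary value, not the decimal text in the source file.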

FormatFloat

// FormatFloat converts the floating-point number f to a string,
// according to the format fmt and precision prec. It rounds the
// result assuming that the original was obtained from a floating-point
// value of bitSize bits (32 for float32, 64 for float64).
//
// The format fmt is one of
// 'b' (-ddddp±ddd, a binary exponent),
// 'e' (-d.dddde±dd, a decimal exponent),
// 'E' (-d.ddddE±dd, a decimal exponent),
// 'f' (-ddd.dddd, no exponent),
// 'g' ('e' for large exponents, 'f' otherwise),
// 'G' ('E' for large exponents, 'f' otherwise),
// 'x' (-0xd.ddddp±ddd, a hexadecimal fraction and binary exponent), or
// 'X' (-0Xd.ddddP±ddd, a hexadecimal fraction and binary exponent).
//
// The precision prec controls the number of digits (excluding the exponent)
// printed by the 'e', 'E', 'f', 'g', 'G', 'x', and 'X' formats.
// For 'e', 'E', 'f', 'x', and 'X', it is the number of digits after the decimal point.
// For 'g' and 'G' it is the maximum number of significant digits (trailing
// zeros are removed).
// The special precision -1 uses the smallest number of digits
// necessary such that ParseFloat will return f exactly.
func FormatFloat(f float64, fmt byte, prec, bitSize int) string {
	return string(genericFtoa(make([]byte, 0, max(prec+4, 24)), f, fmt, prec, bitSize))
}


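FormatFloat takes four parameters:

f: the float64 value to format
fmt: the format; 'f' means no exponent
prec: the precision; passing 2 keeps 2 digits after the decimal point
bitSize: must be 32 or 64; any other value panics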
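A short sketch of how the four parameters interact. The wrapper format and the sample value 1234.5678 are chosen only for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// format is a thin wrapper around strconv.FormatFloat so the calls line up.
func format(f float64, verb byte, prec, bitSize int) string {
	return strconv.FormatFloat(f, verb, prec, bitSize)
}

func main() {
	fmt.Println(format(1234.5678, 'f', 2, 64))  // 1234.57
	fmt.Println(format(1234.5678, 'e', 2, 64))  // 1.23e+03
	fmt.Println(format(1234.5678, 'g', -1, 64)) // 1234.5678

	// bitSize tells FormatFloat how many bits the original value had.
	x := float64(float32(0.1))
	fmt.Println(format(x, 'f', -1, 32)) // 0.1
	fmt.Println(format(x, 'f', -1, 64)) // longer: the float32 rounding error becomes visible
}
```

With prec set to -1, the number of digits depends on bitSize, which is why the same value prints differently in the last two lines.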

max(prec+4, 24) and genericFtoa

max simply returns the larger of two values.

func max(a, b int) int {
	if a > b {
		return a
	}
	return b
}

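It compares the requested precision plus 4 against 24 and takes the larger of the two. So whenever the number of digits kept after the decimal point is at most 20, the value is 24.

In that case the call is make([]byte, 0, 24), which allocates a byte (uint8) slice with length 0 and capacity 24.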
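The same pre-sizing idea can be tried from user code with strconv.AppendFloat, which appends into a caller-supplied buffer. This is a sketch under that assumption; bufCap and formatInto are hypothetical helpers that only imitate the sizing rule described above.

```go
package main

import (
	"fmt"
	"strconv"
)

// bufCap mirrors the max(prec+4, 24) sizing used by FormatFloat.
func bufCap(prec int) int {
	if prec+4 > 24 {
		return prec + 4
	}
	return 24
}

// formatInto formats f with prec decimals into a buffer sized by bufCap and
// returns the text together with the buffer's final capacity.
func formatInto(f float64, prec int) (string, int) {
	buf := make([]byte, 0, bufCap(prec))
	buf = strconv.AppendFloat(buf, f, 'f', prec, 64)
	return string(buf), cap(buf)
}

func main() {
	s, c := formatInto(9.815, 2)
	fmt.Println(s, len(s), c)           // 9.81 4 24
	fmt.Println(bufCap(20), bufCap(21)) // 24 25
}
```

Because the result fits in the initial capacity, the append does not need to grow the slice.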


func genericFtoa(dst []byte, val float64, fmt byte, prec, bitSize int) []byte {
	var bits uint64
	var flt *floatInfo

	// Author's note
	print("float64 value passed in:", val, "\n")

	switch bitSize {
	case 32:
		bits = uint64(math.Float32bits(float32(val)))
		flt = &float32info
	case 64:
		bits = math.Float64bits(val)
		flt = &float64info
	default:
		panic("strconv: illegal AppendFloat/FormatFloat bitSize")
	}

	// Author's note
	print("resulting bits value:", bits, "\n")

	// Author's note
	// Output printed at this point:

	// float64 value passed in:+9.815000e+000
	// resulting bits value:4621714971847588577
	// 9.81

	// float64 value passed in:+9.185000e+000
	// resulting bits value:4621360313376933151
	// 9.19


	neg := bits>>(flt.expbits+flt.mantbits) != 0
	exp := int(bits>>flt.mantbits) & (1<<flt.expbits - 1)
	mant := bits & (uint64(1)<<flt.mantbits - 1)

	switch exp {
	case 1<<flt.expbits - 1:
		// Inf, NaN
		var s string
		switch {
		case mant != 0:
			s = "NaN"
		case neg:
			s = "-Inf"
		default:
			s = "+Inf"
		}
		return append(dst, s...)

	case 0:
		// denormalized
		exp++

	default:
		// add implicit top bit
		mant |= uint64(1) << flt.mantbits
	}
	exp += flt.bias

	// Pick off easy binary, hex formats.
	if fmt == 'b' {
		return fmtB(dst, neg, mant, exp, flt)
	}
	if fmt == 'x' || fmt == 'X' {
		return fmtX(dst, prec, fmt, neg, mant, exp, flt)
	}

	if !optimize {
		return bigFtoa(dst, prec, fmt, neg, mant, exp, flt)
	}

	var digs decimalSlice
	ok := false
	// Negative precision means "only as much as needed to be exact."
	shortest := prec < 0
	if shortest {
		// Try Grisu3 algorithm.
		f := new(extFloat)
		lower, upper := f.AssignComputeBounds(mant, exp, neg, flt)
		var buf [32]byte
		digs.d = buf[:]
		ok = f.ShortestDecimal(&digs, &lower, &upper)
		if !ok {
			return bigFtoa(dst, prec, fmt, neg, mant, exp, flt)
		}
		// Precision for shortest representation mode.
		switch fmt {
		case 'e', 'E':
			prec = max(digs.nd-1, 0)
		case 'f':
			prec = max(digs.nd-digs.dp, 0)
		case 'g', 'G':
			prec = digs.nd
		}
	} else if fmt != 'f' {
		// Fixed number of digits.
		digits := prec
		switch fmt {
		case 'e', 'E':
			digits++
		case 'g', 'G':
			if prec == 0 {
				prec = 1
			}
			digits = prec
		}
		if digits <= 15 {
			// try fast algorithm when the number of digits is reasonable.
			var buf [24]byte
			digs.d = buf[:]
			f := extFloat{mant, exp - int(flt.mantbits), neg}
			ok = f.FixedDecimal(&digs, digits)
		}
	}
	if !ok {
		return bigFtoa(dst, prec, fmt, neg, mant, exp, flt)
	}
	return formatDigits(dst, shortest, neg, digs, prec, fmt)
}
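The neg, exp, and mant lines near the top of genericFtoa can be tried in isolation. The sketch below hard-codes the float64 layout (11 exponent bits, 52 mantissa bits, bias 1023) instead of using the unexported floatInfo struct; decompose and rebuild are names invented for this example.

```go
package main

import (
	"fmt"
	"math"
)

// decompose splits a float64 into sign, unbiased binary exponent, and the
// 52-bit fraction field, using the same shifts and masks as genericFtoa.
func decompose(v float64) (neg bool, exp int, mant uint64) {
	const expbits, mantbits = 11, 52 // float64 layout
	bits := math.Float64bits(v)
	neg = bits>>(expbits+mantbits) != 0
	exp = int(bits>>mantbits)&(1<<expbits-1) - 1023 // remove the bias
	mant = bits & (uint64(1)<<mantbits - 1)
	return neg, exp, mant
}

// rebuild reverses decompose for normal (non-zero, finite) values by
// restoring the implicit top bit and scaling by the binary exponent.
func rebuild(neg bool, exp int, mant uint64) float64 {
	v := math.Ldexp(float64(mant|uint64(1)<<52), exp-52)
	if neg {
		v = -v
	}
	return v
}

func main() {
	neg, exp, mant := decompose(9.815)
	// exp is 3 because 2^3 <= 9.815 < 2^4.
	fmt.Printf("neg=%v exp=%d mant=%#x\n", neg, exp, mant)
	fmt.Println(rebuild(neg, exp, mant) == 9.815) // true
}
```

The decimal digits are derived from this integer mantissa and binary exponent, which is why the output reflects the stored binary value rather than the literal typed in the source.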

2024-10-30

package main

import (
	"log"
	"math"
)

func main() {

	xchiPrice := 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000005

	xchiPrice = 0
	log.Printf("xICHI price: %.2f", xchiPrice)

	xichiRatio := math.Abs(12000000) / xchiPrice
	log.Printf("xichiRatio : %.2f", xichiRatio)

}


2024/10/30 10:50:08 xICHI price: 0.00
2024/10/30 10:50:08 xichiRatio : +Inf

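xchiPrice has to be 0, or so close to 0 that the quotient overflows float64, before the result becomes +Inf. With a numerator of 12000000, that means a price below roughly 6.7e-302.

The upper limit of float64 is very large.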
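To see where the ratio tips over into +Inf, here is a small sketch. The function ratio mirrors the xichiRatio computation; overflows and the sample prices are assumptions picked for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// ratio mirrors the xichiRatio computation.
func ratio(price float64) float64 {
	return math.Abs(12000000) / price
}

// overflows reports whether ratio(price) is +Inf.
func overflows(price float64) bool {
	return math.IsInf(ratio(price), 1)
}

func main() {
	// Small but ordinary prices stay finite; only extremely small ones
	// (or exactly 0) push the quotient past math.MaxFloat64.
	for _, p := range []float64{1e-10, 1e-300, 1e-303, 0} {
		fmt.Println(p, ratio(p), overflows(p))
	}
}
```

Only the last two prices produce +Inf: 12000000 / 1e-303 is about 1.2e310, which is beyond the float64 range, and dividing by 0 gives +Inf directly.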

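When does a value become +Inf in Go?

In Go, whether you use the standard floating-point types (float32 and float64) or the math/big package, a value can become +Inf (positive infinity) in the following situations:

1. Division by zero

Dividing a positive floating-point value by zero at run time produces +Inf (a negative dividend gives -Inf, and 0/0 gives NaN). The divisor has to be a variable, because a constant expression such as 1.0 / 0.0 is rejected by the compiler:

zero := 0.0
result := 1.0 / zero // result is +Inf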

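2. Exceeding the largest representable floating-point value

When the result of a computation is larger than the maximum value the floating-point type can represent, it becomes +Inf:

maxFloat := math.MaxFloat64
result := maxFloat * 2 // result is +Inf

3. Adding large numbers

Adding two very large floating-point numbers can also overflow to +Inf:

a := math.MaxFloat64
b := math.MaxFloat64
result := a + b // result is +Inf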

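4. Using `big.Float`

With the math/big package, similar operations also yield +Inf:

Dividing a non-zero value by big.NewFloat(0) (0 divided by 0 panics with ErrNaN instead).
Producing a result outside the range big.Float can represent.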
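The division case can be checked with a few lines. This is a minimal sketch; quoIsPosInf is a hypothetical helper built on Float.Quo, Float.IsInf, and Float.Sign from math/big.

```go
package main

import (
	"fmt"
	"math/big"
)

// quoIsPosInf reports whether x / y evaluates to +Inf with big.Float.
// Do not call it with x == 0 and y == 0: that division panics with ErrNaN.
func quoIsPosInf(x, y float64) bool {
	q := new(big.Float).Quo(big.NewFloat(x), big.NewFloat(y))
	return q.IsInf() && q.Sign() > 0
}

func main() {
	fmt.Println(quoIsPosInf(1, 0))  // true
	fmt.Println(quoIsPosInf(1, 2))  // false
	fmt.Println(quoIsPosInf(-1, 0)) // false (the result is -Inf)
}
```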

Example code

Here is a simple example demonstrating these cases in Go:

package main

import (
	"fmt"
	"math"
)

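func main() {
	zero := 0.0
	maxFloat := math.MaxFloat64

	// Division by zero (the divisor must be a variable; 1.0 / 0.0 does not compile)
	fmt.Println(1.0 / zero) // +Inf

	// Exceeding the maximum value (math.MaxFloat64 * 2 as a constant does not compile)
	fmt.Println(maxFloat * 2) // +Inf

	// Adding large numbers
	a := math.MaxFloat64
	b := math.MaxFloat64
	fmt.Println(a + b) // +Inf
}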

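Notes

When doing floating-point arithmetic, especially where overflow is possible, handle the results with care and check them where necessary (for example with math.IsInf) so that +Inf does not propagate unnoticed.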
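One possible shape for such a check is sketched below. The names checkedDiv and errNotFinite are made up for this example and are not part of any library; only math.IsInf and math.IsNaN are standard.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errNotFinite is a hypothetical sentinel error for this sketch.
var errNotFinite = errors.New("result is not finite")

// checkedDiv divides a by b and rejects +Inf, -Inf, and NaN results.
func checkedDiv(a, b float64) (float64, error) {
	q := a / b
	if math.IsInf(q, 0) || math.IsNaN(q) {
		return 0, errNotFinite
	}
	return q, nil
}

func main() {
	fmt.Println(checkedDiv(10, 4)) // 2.5 <nil>
	fmt.Println(checkedDiv(10, 0)) // 0 result is not finite
}
```

Returning an error keeps the decision about what to do with a non-finite result at the call site, instead of letting +Inf flow into later calculations or logs.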

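What is the largest representable floating-point value?

In Go, the largest representable value depends on the type. Go has two floating-point types: float32 and float64.

1. `float32`

Maximum: math.MaxFloat32
Value: 3.4028235e+38

2. `float64`

Maximum: math.MaxFloat64
Value: 1.7976931348623157e+308

Example code

You can print these maximum values with the following code: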

package main

import (
	"fmt"
	"math"
)

func main() {
	// The float32 conversion makes the constant print with float32 precision.
	fmt.Println("Max Float32:", float32(math.MaxFloat32)) // 3.4028235e+38
	fmt.Println("Max Float64:", math.MaxFloat64)          // 1.7976931348623157e+308
}

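Summary

The maximum float32 value is about 3.4 × 10^38.
The maximum float64 value is about 1.8 × 10^308.

Because float64 offers both a wider range and higher precision, it is the usual recommendation when precision matters.

If a is declared as a floating-point variable, dividing by it does not panic even when it is 0 (which I did not expect). Writing fmt.Println(1 / 0) directly, however, fails to compile with: invalid operation: division by zero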

package main

import (
	"fmt"
)

func main() {

	var a float64

	a = 0
	fmt.Println(1 / a) // +Inf
	// fmt.Println(1 / 0)  // invalid operation: division by zero

}
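For contrast, integer division by a zero variable compiles but panics at run time. The sketch below uses a hypothetical helper, intDiv, that recovers from the panic and turns it into an error so the behavior can be observed safely.

```go
package main

import "fmt"

// intDiv divides two ints and converts the run-time panic raised by a zero
// divisor into an error.
func intDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(intDiv(7, 2)) // 3 <nil>
	fmt.Println(intDiv(1, 0)) // 0 recovered: runtime error: integer divide by zero
}
```

So there are three distinct outcomes: a constant division by zero is a compile error, an integer division by a zero variable panics, and a floating-point division by a zero variable yields +Inf, -Inf, or NaN.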

Original link: https://dashen.tech/2020/12/02/Go源码分析之浮点型处理/

Copyright notice: please credit the source when reposting.

清澄秋爽
